
New Skill Unlocked: Targeted Googling

Formally, the Fieldschool taught us many, many things – from the principles of DH project management to the difference between JavaScript’s ‘methods’ and ‘functions’ – but inevitably some skills arose a little more, um, organically. One great skill I picked up related to problem-solving and troubleshooting my own (frequent) mistakes. The tech/dev team really honed a strategy that we called ‘targeted Googling’. We should have given it a more dynamic name, but at least it’s functional. Anyway, this phrase just refers to the process by which we solved almost all of our technical problems: just Google the heck out of it.

This sounds both facetious and really obvious, but I recommend it quite seriously (and it won’t be any kind of secret to people who work permanently in programming or web development). There is definitely a knack to targeted Googling, too. For those of us who usually use Google to settle trivia debates, it’s a different kettle of fish using one search bar to explain why you think you just broke an entire web page and/or visualisation – and to elicit a useful response.

The first challenge is understanding what’s wrong and how to describe your problem correctly. It was a great day when I realised that a web browser’s JavaScript console provides you with an error message: the problem wasn’t just ‘timeslider not appearing leaflet plugin’ but that, in the plugin’s JavaScript, the variable SliderControl was not defined. Until I discovered this, I kept trying to describe the errors in terms of how they presented as symptoms, using (often uselessly) broad terms. But the trick to getting useful results is describing where something went wrong – and the error message tells you! My overwhelming recommendation to other newbies: find the error message. It will become your best friend and ally.
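
To show what I mean, here’s a minimal sketch (the plugin and its SliderControl variable are stand-ins echoing my example above) of how the console turns a vague symptom into an exact, Google-able phrase:

```javascript
// If a Leaflet plugin's script fails to load, using its global name throws:
//   Uncaught ReferenceError: SliderControl is not defined
// That exact phrase is what you paste into Google. A defensive check makes
// the failure explicit instead of leaving a silently broken page:
if (typeof SliderControl === 'undefined') {
  console.error('SliderControl is not defined - did the plugin script load?');
} else {
  var slider = new SliderControl({ position: 'topright' }); // hypothetical options
  map.addControl(slider); // assumes an existing Leaflet map object called 'map'
}
```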

The other troublesome step is interpreting the hits you get back. The results are inevitably forum after forum after comment thread after forum, but somewhere in there will be a response to exactly your problem and the piece of correct code you need. I learned to read more than the first page of hits, look at all of the forums and try all of the solutions. I wish there were a more polite way of saying ‘trial and error’, but implementing the solution is just that: you have to get used to persevering and somewhat blindly plugging in pieces of code that maaay be right (though you’re not sure why).

The last point/caveat is this: unfortunately, knowing all of this still doesn’t mean that I can fix my own problems every time (that’s a whole different story), but knowing what’s wrong, using the right terminology and feeling comfortable being a little bit blind is a significant start to getting the answer you need.

Messrs. TileMill, MapBox & MapBox.js

Last week was an incredibly busy week at the Fieldschool. The first half was dedicated to planning our final project as a group, while the second half shifted to working in teams (Tech, Design and Content) to start building. By Friday, the tech team had managed to successfully generate, wrangle and clean a formidable bucket of data, and was moving quickly onto constructing visualisations. It’s not too much of a giveaway to say that one of the visualisation toolkits we’re using is a TileMill, MapBox and MapBox.js combination. These three tools provide the best way for us to build beautiful map tiles and map layers offline (TileMill), host them online (MapBox) and carefully customise their functionality (MapBox.js). So far, I am finding a lot to reflect on and/or value in these tools:

The ‘return on investment’: The objects you can create using TileMill are delightful (there’s no other word for it) out of all proportion to the work it takes to gain and apply the skills needed to use the application. On Friday morning, we built two test maps that meet our purpose, look distinctive and contain great features like image pop-ups. To do this on top of TileMill’s most basic functionality required some targeted Googling and roughly half an hour. For me, this makes TileMill very valuable: while it isn’t necessarily easy to use straight ‘out of the box’ without some existing knowledge (i.e. how digital maps are put together, plus basic HTML and CSS), the reward for learning those basic skills and applying what feels like only slightly more effort is incredibly significant.

Learning in context: Working with these tools for the final project has reinforced crucial skills in a context that makes them more likely to stick. This can be contrasted against Codecademy’s one flaw that another Fieldschooler pointed out earlier: learning skills in a vacuum can undermine effectively deploying those skills outside of that vacuum. So, while I may know how to call a JavaScript function, I might not have a complete picture of when or why I would do that. Happily, though, using TileMill + MapBox for a distinct purpose means that I am gaining, for example, a more holistic idea of problem solving, and that I know where the rest of the machinery fits in (i.e. how to test your hosted map before you upload it permanently).

(Always) more to learn: Tuesday’s plan is to leap boots n’ all into MapBox.js. It will be interesting to see how this experience shapes my reflections on this set of tools since, generally speaking, I have found MapBox.js significantly harder to use than TileMill or MapBox. Watch this space for an update on how using a JS library challenges my delight.
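
As a little preview, here’s roughly what that looks like – a minimal sketch assuming the Leaflet-based MapBox.js API, with a placeholder map ID and image rather than our real hosted map:

```javascript
// Assumes mapbox.js and its CSS are on the page, plus a <div id="map"></div>.
// The map ID below is a placeholder - yours comes from your MapBox account.
var map = L.mapbox.map('map', 'examples.map-4l7djmvo')
    .setView([-41.29, 174.78], 14); // centre on Wellington, say

// One of those great features (an image pop-up), customised in code:
L.marker([-41.29, 174.78])
    .bindPopup('<img src="printery.jpg" width="200">') // hypothetical image
    .addTo(map);
```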


Getting to know your data

This week I was on a project team that learned an incredibly important lesson about creating data-driven visualisations: get to know your data really well before you get started. Any visualisation you build is considerably sculpted not only by the meaning implicit in the data, but also by how the data has been captured. The issue we faced was trying to explore a story that depended on fine-grained geocoding, but we realised only too late that our data reflected a ‘coarser grain’ of location. For another visualisation this would have been perfect – our issue is no reflection on the quality of the (super cool) data we were using – but, in our context, the data clearly didn’t function with the narrative we wanted to construct and the meaning we wanted to convey. At the last minute we had to re-think our visualisation and explore a completely different facet of the data. Though we managed to get a new project finished, the process made for some frazzling moments and a late night.

What makes ‘getting to know’ your data difficult?

Time pressure

We were working with 8000 records under significant time pressure, so we wanted to dive straight into building. But, though we began with a great idea for what we wanted to explore, our hurrying meant we didn’t take the time to carefully assess how the location data might function on a map, or how it might relate to the other fields we were trying to represent. It sounds painfully simple, but on the next project I would take the time early on to speculate on these subjects. With a deadline looming, it would be less hectic to miss out some final features than to have to re-think the visualisation.

Starting with an undefined idea of what you (ideally) want to explore.

It’s a chicken-and-egg problem: the final story and visualisation must appropriately reflect how the underlying data has been captured; but, in order to assess your data, you must have a clear idea of that story and the visualisation you want to build. We began with a general idea of what we wanted to convey. This meant that, while we conjectured about how it could look and work, we didn’t sharply envision the vital features needed to convey that idea. And… you can’t always work it out as you go along. I think next time I would follow an iterative process: start with a clear idea, assess what foundational features demonstrate it effectively, check the data is capable of that, reshape the idea, and so on. Again, taking the time for this at the beginning saves a whole world of coffee and confusion at 11pm.

Applying this lesson outside of the Fieldschool?

The data I work with on a day-to-day basis focuses on Wellington’s print history. My biggest dataset is spreadsheet upon spreadsheet of details about late 19th-century printers, publishers, booksellers and engravers. It is meticulously geocoded and can be sliced by year or print service. At this stage, some records are geocoded by numbered street address but some only by street name. From this week’s project I can easily tell that there are limitations to what this data can tell me or how it will convey meaning most effectively. It could tell you an interesting story about the frequency of available print services in individual streets of the city, but – at this stage – I would not be able to create a network diagram that links individual addresses.
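
To sketch that granularity point in code (the records and field names below are invented for illustration): per-street counts work at either grain, but an address-level network has to throw away the coarser records:

```javascript
// Invented sample records - the real ones live in my spreadsheets.
var records = [
  { trade: 'printer',    street: 'Lambton Quay',  number: '87' },
  { trade: 'engraver',   street: 'Willis Street', number: null }, // street name only
  { trade: 'bookseller', street: 'Willis Street', number: '12' }
];

// Per-street frequency works for every record, whatever its grain...
var byStreet = {};
records.forEach(function (r) {
  byStreet[r.street] = (byStreet[r.street] || 0) + 1;
});
console.log(byStreet); // { 'Lambton Quay': 1, 'Willis Street': 2 }

// ...but a network of individual addresses must drop the coarser records.
var addressable = records.filter(function (r) { return r.number !== null; });
console.log(addressable.length); // only 2 of 3 records could be nodes
```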

Long time listener, first time hacker

Last weekend I spent a fun few hours following Australia’s 2013 #GovHack on Twitter. As the name suggests, this event aimed to encourage “open government and open data” by inviting teams to “mashup, reuse, and remix government data” at meetings held across the country. Unsurprisingly, there were some wonderful results. The theme of open data and reuse resonated strongly at the Fieldschool this week as we practised finding, extracting and manipulating not just government statistics but any open and available data. I had thought that the technical skills needed to remix data from the web were out of my reach because I wasn’t a programmer, but happily the Fieldschool proved me totally wrong. How did this happen?

1. Finding data is surprisingly simple.

Many organisations give away data in formats that are easy to interpret

Many governments (including the US, UK and New Zealand) provide giant datasets for people to reuse. Meanwhile, an increasing number of museums, galleries and online repositories are opening their data doors too. Often, all it takes is roaming around a website to find the ‘download data’ option. On top of this, data is often provided in formats that people can easily understand: a CSV file is no more complex than a spreadsheet. I would hazard a guess that simply knowing useable data exists, and that it can often be easily understood, dismantles the first significant barrier to reuse.

APIs are incredible

Learning about APIs felt like being given the keys to the castle because they allow you to reuse data on-the-fly. To my non-programmer mind, APIs took a while to understand because you can only really ‘get’ how they work on their own terms (culprit #2: JavaScript functions), and the process of requesting data dynamically is more complex than downloading it once-off. APIs come in many flavours too, so you aren’t assured the same request and response format every time. But this week we learnt the basic recipe and, despite the increased complexity, I would never hesitate to use an API: I at least feel confident that I can figure it out.
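
For the curious, the basic recipe looks something like this minimal browser JavaScript sketch – the endpoint and parameters are hypothetical, since every API documents its own:

```javascript
// The basic recipe: ask an API for data over HTTP, then parse the response.
var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://api.example.org/records?q=wellington&format=json'); // hypothetical endpoint
xhr.onload = function () {
  if (xhr.status === 200) {
    var data = JSON.parse(xhr.responseText); // structured data, ready to remix
    console.log(data);
  } else {
    console.error('Request failed: ' + xhr.status); // the error message is, as ever, your friend
  }
};
xhr.send();
```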

You may be able to scrape it

Scraping is the process of extracting unstructured data from an HTML document (i.e. a webpage) and structuring it so that it can be manipulated for visualisation. Our technique was so straightforward that all we needed was a Google spreadsheet and tabular data from Wikipedia. I did learn that it is not a foolproof technique: my spreadsheet went a little bit haywire when I tried to scrape this table later on. (Bonus points for anyone who can figure out why it didn’t work.)
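
If you’d like to try the spreadsheet trick yourself, I’m fairly sure the relevant incantation is Google Sheets’ IMPORTHTML function – something like this, where the URL and table index are placeholders:

```
=IMPORTHTML("https://en.wikipedia.org/wiki/Example_article", "table", 1)
```

(The last number picks which table on the page you want; my guess is that a mismatch there is part of why my later attempt went haywire.)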

2. Cleaning data is surprisingly fulfilling.

While we learnt that cleaning data is crucial to successfully reusing it, opinions vary on how enjoyable the process is. Using a powerful tool like OpenRefine, I was surprised at how enjoyable I found it. If you enjoy meticulous activities like jigsaw puzzles or knitting, then take my word for it: cleaning data is genuinely absorbing.

3. Meanwhile: Data licensing is incredibly important.

One important point we learnt this week (if not from the Fieldschool, then from the media) is that data is not neutral or free-floating. When remixing, you have to be aware of the use limitations placed by the person providing the data. But, even then, licensing is not the impenetrable brick wall that you might think it is. Navigating licensing can be as simple as familiarising yourself with the Creative Commons licences. A handy tip for the remainder of the Fieldschool is that visualisations are derivative copies.

4. Knowing what you want to do with data becomes wonderfully obvious.

The last exciting discovery of this week is that I actually have ideas about what I’d like to make. I thought I’d have ‘hacker’s block’ about what to do with data, but I’m relieved to discover that’s definitely not the case. As soon as we learnt about various sources and techniques for extracting data, 1000 ideas appeared from nowhere. It obviously just took learning about what was possible for my mind to leap into action.

Essentially, while I can’t step out and immediately build the world’s best data-driven app, this week has proved that many of the barriers to remixing data I’d anticipated are roughly a day’s worth of (hard) concentrating away from being dismantled.

Hello, I’m Flora.

Hi everyone! My name is Flora and I am a librarian/archivist/information manager by trade, training and inclination. Usually I live (back home) in New Zealand, where I am a research assistant within an awesome team at Victoria University of Wellington’s Wai-te-ata Press. There, I work on digital history projects surrounding New Zealand’s early print culture and trade. I have a BA in English Literature and (oh so soon) a Master of Information Studies. I am captivated by data, metadata, and linked open data.

Like everyone at the Fieldschool, I am fascinated by how digital technology intersects with and impacts cultural heritage concerns and practices. Investigating digital cultural heritage is important for a million reasons, but a few points resonate strongly with me: firstly, a digital heritage environment allows us to engage with audiences outside institutional walls. Secondly, it radically alters how we perform familiar cultural heritage practices (curation, preservation etc.). And, lastly, it provides new ways of asking and answering historical or heritage questions. These changes provide countless opportunities for (and challenges to…) enriching cultural heritage, and since digital technology looks like it’s here to stay, it’s critical to understand what these factors are and how they work!

If I were only allowed three questions, I would ask: what happens to cultural heritage when you conceptualise it using digital tools? Practically speaking, how can digital tools augment heritage practices and inquiries? And, most importantly, please can someone teach me ALL the technical know-how for creating high-quality projects, okay? Thanks heaps!

I am really looking forward to getting under the hood of the WWW and further developing a technical toolkit. I also feel that knowing the nitty-gritty of digital tools provides a great background for engaging with, creating and using digital cultural heritage thoughtfully and with a reflexive, critical eye. I’m really excited about spending the next few weeks at the Fieldschool, learning new skills and talking shop with interesting, like-minded people. (Other times, when I’m not thinking about information, print culture and the web, I’m usually reading, knitting, doing yoga or going on walks and adventures!)