Hello, World

07 May 2017 . category: Blogging

Welcome! This is my new blog space for documenting my personal journey deeper into the world of data analysis, data storytelling, and finding intelligence within data.

I am not sure what this journey will look like yet (although I have some ideas). What I do know is that the idea of finding a “signal in the noise” has always appealed to me. Finding patterns drove much of my inner sense of play as a kid (as I assume it does for many children). I spent many bath times preoccupied with finding hidden patterns in the “random” arrangement of seashells on the curtain that hung around the tub. In the kitchen, I would make up elaborate games of hopscotch to safely traverse the flower-patterned tiles strewn across the floor.

Patterns, and the underlying rules and logic for constructing them, eventually led me to an interest in Computer Science, which is all about humans applying rule-based logic to machines. As I see it, the task of a developer is to take a scenario, with all of its various interpretations, and essentially bucket all possible inputs to that scenario in a way that results in a set of predictable, desirable outcomes. Here’s some noise: find patterns, make it useful.

The web is noisy. More precisely, it has many structures, perhaps a different one for each page on the internet, but rarely is any of it structured for me. To solve this, I became very interested in web scraping (extracting data from HTML) in high school and early college. I chose to focus on Information during my Computer Science major - a focus that “synthesizes topics from across Computer Science that pertain to creating, processing and understanding digital information in the modern world” [1]. I found natural language processing (NLP) fascinating.

But during college, I only dipped my toes into this interest in making sense of data. I dipped them in again the year after undergrad with Andrew Ng’s Machine Learning course on Coursera. At Microsoft, I frequently attend the various data conferences and talks held for employees, by employees. On my team at OneNote, I often request projects that involve interpreting user and usage data.

In short, I have dabbled a lot. Now I want to confidently take this interest to the next level. A few things have led me to this blog as my tool for exploration.

Learning in public

Projects like Jennifer Dewalt’s 180 websites in 180 days showcase learning by doing and, further, doing it out in the open.

I like the idea of learning in public. The public part encourages sharing, community, transparency, and clarity. The learning part allows for missteps and incremental progress. I hope this blog will provide all of those elements (although, you know, the fewer missteps the better 😄).

The Social Developer

I recently watched this talk about The Social Developer from Scott Hanselman [2]. Here are the reasons Scott gives for becoming a “social developer”: 1) you have something to say; 2) you want to share a corpus of knowledge more effectively and with fewer keystrokes than direct, 1-to-1 messaging; 3) you want to build a reputation online; and/or 4) you want to build a positive community with a passion for learning.

Sounds good to me! Scott also explains that these goals can be accomplished through all sorts of platforms (Twitter, StackOverflow, Medium, GitHub). For now, I am focusing on this blog + GitHub.

Building a portfolio

Another mentor of mine is a leading data scientist from the Seattle start-up scene. Over coffee one Sunday morning, she strongly encouraged me to build up a portfolio of at least three well-curated, beautiful data projects. This felt like great, actionable advice. However, I was (and still am) a little lost on how to get to a “beautiful” portfolio.

My hope is to document and share my journey towards beautiful data projects in a way that lets others either follow along or cherry-pick the bits that work for them.

Footnotes

  1. http://cs.stanford.edu/degrees/undergrad/Tracks.shtml 

  2. Scott is a mentor of mine at Microsoft. He has a blog too. 


Me

Nadja does not particularly enjoy writing about herself.