Wednesday, January 2, 2013

Astronomer to Data Scientist

I recently made the transition from astrophysics researcher to data scientist for a tech company (Yammer / Microsoft). Below are suggestions for people in academia / research who are interested in pursuing a tech job.

Most tech companies are interested in smart, talented people who can learn quickly and have good problem solving skills. Scientists have these attributes. Therefore if you apply for jobs at tech companies, you'll likely get at least a response from a recruiter. However, once you get an interview, there are many other skills that the company will try to assess, skills that you may or may not have already.

Below are some tips that will help you both in the application / interview process and on the job at a tech company.

1) Learn a Standard Language
Sorry astronomers, but IDL isn't going to cut it if you want to get a tech job. You need to learn one of the industry-standard programming languages. Python, Ruby, Java, Perl, and C++ are all good languages to pick up. It would also be good to learn a statistical analysis package like R, SAS, SPSS or Excel, as well as a visualization package to show your results. Some jobs involve a coding interview, which requires some knowledge of computer science algorithms. Look online; there are many example coding problems for you to practice.

2) Learn About Databases
"Big data" is the Web 2.0 it-word. If you want to play with big data, you are going to need to learn how to manage, handle, and access it. SQL is a must. It would be great if you could also familiarize yourself with Hadoop/MapReduce and Hive. 

3) Brush Up on Your Stats
Many tech interviews involve complicated math, probability, statistics, brain-teasers, and open-ended problems. Dust off some of your old statistics textbooks or pick up a book about data analysis using one of the above languages. Search online for past interview questions from the companies you are applying to.

4) Communication is Key
To be effective in a tech job, not only should you be able to program, analyze data, and solve problems -- you also need to be able to explain your work clearly to people who aren't very technical. Communication is incredibly important for these roles, and a huge part of the interview process is gauging how well you explain complicated ideas to a layperson. There are many opportunities to practice this skill within academia, so give many talks, teach classes, tutor, volunteer, or do whatever you can to become very comfortable explaining technical ideas to people with different backgrounds and skill levels.

5) Convert Your CV into a Resume
There is a difference, and it is important. People at tech companies get hundreds of resumes. It is important to succinctly highlight the skills you bring to each job. It's great that you have published dozens of papers, given lots of talks, and taught many classes... but what is more important are the skills you acquired from those experiences. Resumes should only be 1-2 pages. Look at the skills required for the job you are applying for, and then try to demonstrate those skills by listing the relevant experience.

6) Academic vs. Business Problems
In academia the goal is usually to get the most accurate solution possible. Time and efficiency are less important than doing something thoroughly and rigorously. In business the goal is to increase your company's value. Therefore any task must optimize both accuracy AND value. This is a difficult transition for many academics to make. Spend some time reading TechCrunch and other such sites to help familiarize yourself with the various metrics and problems that tech companies care about. Be prepared to work on short deadlines and to prioritize tasks in order to increase the value of your work. Keep this in mind when answering open-ended interview questions so that you demonstrate your understanding of this difference.

7) Do an Internship or a Project
The best way to get your foot in the door of a tech company is to do an internship. Many of the major tech companies have paid summer internships that will introduce you to this type of work, as well as teach you many of the above skills. The Insight Data Science Fellowship is an internship specifically designed for helping academics transition into tech positions. If you are unable to take the time off from your current job, then consider doing a project on your own. Create an application for your phone, or do a research project with one of the many free data sources out there. This will give you insight into the work you might do at a tech company, and an important set of talking points for interviews.

If you have more questions about making the transition from academia to tech or the tech interview process, feel free to contact me.


Anonymous said...

This is awesome btw. Concise, clear, and pure awesome. (Comment ported from Astrobetter).

Johanna Teske said...

What is the recommended course of action for meeting some of these recommendations? Just read books? Are there good online courses (I'm thinking for SQL in particular)? Thanks for a very helpful post!

Chris said...

I'm a recruiter for a hedge fund that employs dozens of physicist researchers (we're data driven) and this is very good advice. The only thing I would add is come prepared to explain your motivations for wanting to move to industry. It's important you can convince your future employer that you are moving for the right reasons.

berkeleyjess said...

Johanna, there are many (free) courses online which can be helpful for learning some of these skills, but I would highly recommend doing a project or internship over simply reading a book or doing a course.

Some good online sources for introducing you to these skills are outlined in this article.

I agree with Chris that you need to demonstrate an interest in whatever job you are applying for. This is why I suggested reading tech blogs if you want to go into tech. You should also have a clear and convincing answer to the question: "Why do you want to leave astronomy (or whatever your current field is)?"

I plan to follow up this post with another blog entry about the pros and cons of tech versus academia (in my experience). Watch this space.


hardtke said...

I moved from high energy physics to data science several years ago, and I'm surprised how difficult it is to recruit data scientists. It's a seller's market (in the Bay Area at least), even for those without industry experience. Our top candidates are getting multiple offers. The trick, from a job seeker perspective, is to realize how valuable your skill set is. I recently wrote a blog post to help people recruit from academia -- this gives you an idea what it is like from the other side: How to Hire A Data Scientist

Kyle said...

Great article Jess!

Regarding #1, for any astronomers still using IDL, it is getting easier and easier to switch to Python. The astronomy Python library astropy is quickly gaining functionality and is nearing a new release.

Even better, contributing to it is easy and is a good way to learn new skills (with Python, git, and other tools) from a community of helpful people.

Unknown said...

This is spot on! I'm also a former astronomer turned data scientist. We learn a lot of skills in astronomy that can translate. The observational nature of astronomy lends itself well to the types of messy data you'll encounter in the real world. And astronomy's hands-on bottom-up analysis is a boon when you're able to code your own statistical procedures. But like you say, there is definitely a lot more to know. Luckily all these things can be learned for free online at places like Coursera in our free (ha!) time.

emil.rehnberg said...

This is a pretty awesome article even for a statistician that wants to transfer into "Data Science" :)
Thank you madam!

Anonymous said...

How might someone go in the opposite direction: from data scientist to astronomy research? From here it looks like the only path is grad school. What does it take to get involved without an advanced degree in physics?

Unknown said...

It's exciting to see 'hard' scientists migrating over to computer science!

Anonymous said...

@Anonymous data scientist:
Sorry, but for the most part it does require an advanced degree. A master's does it for some low-level jobs in astronomy; a PhD is needed for most. And there are already too many people with PhDs competing for these jobs (hence this post), so no one will consider someone without.

The one exception is that some groups/collaborations hire specialty programmers and the like. We can all program, but most of us do it haphazardly as a means to an end, and for high-performance applications it can really pay to hire a professional programmer.

Other than that, I think you're out of luck. Every job I know that doesn't require an advanced degree is open just to students, and in addition pays crap (undergrads will work for minimum wage).

Anonymous said...

You also have to learn a lot about text processing, that is, regular expressions, character sets, encodings, etc.

Anonymous said...

Johanna Teske, the Coursera course on databases is a good introduction for a pretty solid foundation in databases.

Unknown said...

I would suggest some reading on Machine Learning (an online course would be better). Chances are you have already used a large part of the concepts formalized in Machine Learning through data reduction and analysis without ever noticing or being told so. Machine Learning is a great skill to sell for a job in data science as well.

Anonymous said...

Thank you for the encouragement! As a current astrophysics researcher hoping to eventually cross over into data science, this post makes the process feel less daunting. Quick question though: how valuable is it to complete a PhD before the transition? I've heard conflicting opinions. Some say I should definitely complete the PhD because it will boost my entry-level (and overall) salary. Others say it's really not worth the time and I could instead be training (or finding an internship like you mentioned). What are your thoughts?

berkeleyjess said...

There isn't one 'right' path to becoming a data scientist. The field is competitive, and I have found that it is hard to break in without a technical master's or PhD degree, or significant experience working with data, stats, and analytics toolsets.

Working at a tech company as an analyst or software engineer or doing an internship is one way to gain experience and get your foot in the door. Another way is to get a graduate degree. I don't think one is better or worse, it just depends on what you want to spend your time doing. I enjoyed graduate school, and I am glad I did it, but I don't think it was necessary to become a data scientist (at least not all of it).

A PhD will give you a higher starting salary, but not as high as you will get by entering the field earlier and working for the same amount of time. For instance, at Yammer I had a coworker who was 5 years younger than me and started working in tech right out of undergrad. He was at a more senior level and made more money than me even though I had more degrees than him. Experience > degrees in the tech field, I find.

People frequently ask me what I think about the 'data science masters' programs that have been popping up recently. I honestly don't know much about them. I've never hired anyone out of one of those programs. I have mostly hired people from PhD programs or who are already data scientists/engineers/analysts. I do not know if students will learn something in those programs that they couldn't teach themselves. A lot of them are very new, and so it's hard to even know how effective they'll be at ultimately placing students in a job. Sometimes the main benefit of these types of programs is networking and connecting with companies that you wouldn't otherwise get access to, not the actual skill sets being taught.

Unknown said...

Hi Jessica

I wanted to know whether you took any data science or computing courses at a university.

Many thanks