Thursday, October 3, 2019

What are the core skills of a Data Scientist?


  Being able to turn a vague idea into an accurately described problem, and to propose a framework for understanding it

For a newcomer just getting started, most of the problems you encounter are probably clearly defined. For example: which country has the most potential for user growth in the next year? Where do tablet users spend the most time? However, as your skills deepen, you will be exposed to more and more ambiguous problems, and sometimes the problem itself will feel too abstract to even begin. For example: can we start promoting this new product? How do we measure whether we have succeeded?

At this point, the first step is to break the problem into several specific, smaller questions. For the question "Can we start promoting this new product?", we may want to know: Does the product quality meet the preset standards? Which feature do users rely on most in the early stage of the product? Is it a feature that can be promoted? If we promote the product and gain X users, how many of them will still be there a few months later?
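The last of those questions can be turned into a first rough estimate with very little code. The following is a minimal sketch in Python, assuming a constant monthly retention rate; that simplification, the function name `retained_users`, and the figures used (10,000 acquired users, 80% monthly retention) are all hypothetical and chosen only for illustration.

```python
def retained_users(acquired, monthly_retention, months):
    """Expected number of users remaining after `months` months.

    Assumes (for illustration only) that the same fraction of users
    is retained every month.
    """
    if not 0.0 <= monthly_retention <= 1.0:
        raise ValueError("monthly_retention must be between 0 and 1")
    if months < 0:
        raise ValueError("months must be non-negative")
    return acquired * monthly_retention ** months


# Hypothetical inputs: X = 10,000 new users, 80% retained each month.
projection = [round(retained_users(10_000, 0.8, m)) for m in range(4)]

for month, users in enumerate(projection):
    print(f"month {month}: about {users} users")
```

Real retention curves usually flatten over time rather than decaying at a fixed rate, so a number like this is a starting point for the discussion, not an answer.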

 The ability to answer questions from different angles and weigh the trade-offs

Perhaps no single method is 100% correct or can answer a question completely, but a good data analyst can present data along different dimensions, sum up the story, and give the most likely answer.

Continuing with the previous example: which feature do users rely on most in the early stage of the product, and is it a feature that can be promoted? In addition to looking at users' usage data for this product, you may want to look at their usage data for competing products. Maybe you want to look at market data to determine market size and demand. Maybe you want to look at the attributes of the users themselves (age, education, gender, place of residence, or main social circle), or at how users change after they start using the product. There are so many things you can look at that it is easy to get lost in an endless series of charts. But which ones matter most?

 Trust your intuition, but not blindly: before analyzing the data, dare to make your own guesses, but objectively accept the various possibilities the data presents, and rationally choose the most likely outcome

I have come across a lot of data analysis that was deliberately pieced together to tell a story, and it always makes me angry. If you have already decided that Product A is better than Product B, why go to such lengths to prove it with data? The goal of data analysis is to let you compare A and B rationally and help you make the right choice, not to confirm your guess and persuade others to follow you. Confirming your guess is only one possible result of the analysis.

In the era of big data, which occupations are most in demand? The answer can be found in this year's campus recruitment salary lists: algorithm engineers, artificial intelligence researchers, data analysts, and similar positions. In fact, these positions overlap to a certain extent, because all of them involve processing large amounts of data. This is especially true of the data scientist, whose main work is processing and analyzing data, and some of that work overlaps with the work of algorithm engineers and artificial intelligence researchers. What sets the data scientist apart is a greater sensitivity to data. So which skills should a data scientist have? This article gives a brief overview.

 Academic background

Data scientists are generally highly educated: 88% hold at least a master's degree and 46% hold a doctorate, which suggests that becoming a data scientist requires a very strong educational background. Common majors are computer science, social sciences, physical sciences, and statistics. The most common fields of study are mathematics and statistics (32%), followed by computer science (19%) and engineering (16%). The expertise gained while pursuing these degrees provides the skills needed to process and analyze big data.

Can you sit back and relax once you have earned your degree? The answer is no: this is the era of lifelong learning. In fact, most data scientists continue to take online training after earning a master's or doctoral degree, in order to learn specialized skills such as Hadoop or big data querying.

 R programming language

For data scientists, R is often the preferred programming language. R was designed for statistical computing, and it can be used for most of the problems encountered in data science. In fact, 43% of data scientists use R to solve statistical problems.

There is one obstacle when learning R, however: if you have already mastered another programming language, R can be painful to pick up. Even so, there are many R learning resources on the Internet, such as Simplilearn's data science training and R programming language courses.

 Technical Skills: Computer Science

Python Programming

Python has been very popular lately. With the development of artificial intelligence and deep learning, Python has, by some measures, surpassed Java to become the most commonly used programming language. Python is also a common coding language in data science. According to one survey, 40% of respondents use Python as their main programming language.

Because of its versatility, Python can be used for every step of the data science process. For example, Python can read data in a variety of formats and can easily import SQL tables into your code. It also lets you create data sets of your own.
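As a rough illustration of that last point, here is a minimal sketch that reads rows from CSV text and from a SQL table and combines them into one small data set. It uses only the Python standard library (`csv`, `io`, `sqlite3`). The `usage` table, its columns, and all the values are hypothetical, and the in-memory SQLite database stands in for whatever database you actually use.

```python
import csv
import io
import sqlite3

# Hypothetical CSV data; in practice this would come from a file on disk.
CSV_TEXT = """user_id,country,minutes
1,US,30
2,DE,45
3,US,15
"""


def load_csv_rows(text):
    """Parse CSV text into a list of dicts, converting numeric fields to int."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "user_id": int(row["user_id"]),
            "country": row["country"],
            "minutes": int(row["minutes"]),
        }
        for row in reader
    ]


def load_sql_rows(conn):
    """Read every row of the (hypothetical) `usage` table as a list of dicts."""
    conn.row_factory = sqlite3.Row  # lets each row be converted with dict()
    cursor = conn.execute(
        "SELECT user_id, country, minutes FROM usage ORDER BY user_id"
    )
    return [dict(row) for row in cursor]


def minutes_by_country(rows):
    """Sum the `minutes` field per country."""
    totals = {}
    for row in rows:
        totals[row["country"]] = totals.get(row["country"], 0) + row["minutes"]
    return totals


# Build a throwaway in-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage (user_id INTEGER, country TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO usage VALUES (?, ?, ?)",
    [(4, "FR", 20), (5, "US", 10)],
)

# Combine both sources into a single data set and summarize it.
dataset = load_csv_rows(CSV_TEXT) + load_sql_rows(conn)
totals = minutes_by_country(dataset)
print(totals)
conn.close()
```

In everyday work many data scientists would reach for a library such as pandas for this kind of task; the standard library is used here only to keep the sketch free of extra dependencies.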

