Describing a problem precisely when it starts as a vague concept, and
proposing a framework for understanding it
For a newcomer just getting started, most of the problems you encounter are
probably clear-cut. For example: which country has the most potential for user
growth over the next year? Where do tablet users spend the most time? As your
skills deepen, however, you will be exposed to more and more ambiguous
problems, and sometimes the problem itself will feel too abstract to know where
to start. For example: can we start promoting this new product? How do we
measure whether we have succeeded? At this point, the first step is to break
the problem into several specific, smaller questions. For the question "Can we
start promoting this new product?", we may want to know: Does the product's
quality meet the preset standards? Which feature do users rely on most in the
early stage of the product? Is it a feature that can be promoted? If we promote
the product and gain X users, how many of them will still be using it a few
months later?
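The last question, how many of X acquired users remain after a few months, can be turned into a rough calculation. The sketch below assumes a constant monthly retention rate, which is a simplification made here for illustration and not something the article specifies; the function name and the input numbers are hypothetical.

```python
def retained_users(acquired: int, monthly_retention: float, months: int) -> float:
    """Estimate the users remaining after `months` months.

    Assumes (hypothetically) that the same fraction of users is retained
    every month, so the estimate is acquired * monthly_retention ** months.
    """
    if not 0.0 <= monthly_retention <= 1.0:
        raise ValueError("monthly_retention must be between 0 and 1")
    if months < 0:
        raise ValueError("months must be non-negative")
    return acquired * monthly_retention ** months


# Hypothetical inputs: 10,000 users acquired, 80% retained each month.
remaining = retained_users(10_000, 0.8, 3)
print(f"Estimated users left after 3 months: {round(remaining)}")
```

In practice retention is rarely constant, so a measured retention curve would replace the single rate used here.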
The ability to answer questions from different angles and to weigh
trade-offs
Perhaps no single method is 100% correct or can fully answer a question, but a
good data analyst can present data along different dimensions, pull the
findings together into a story, and give the most likely answer. Continuing
with the previous example: which feature do users rely on most in the early
stage of the product, and is it a feature that can be promoted? In addition to
looking at users' usage data for this product, you may want to look at their
usage data for competing products. You may want to look at market data to
determine market size and demand. You may want to look at the users themselves
and their attributes (age, education, gender, place of residence, or main
social circle), or at how users change after they start using the product.
There are so many things you could look at that it is easy to get lost in an
endless series of charts. But which ones matter most?
Trust your intuition, but not blindly: before analyzing the data, dare to
make your own guess, then objectively accept the possibilities the data
presents and rationally choose the most likely outcome
I have come across many data analyses that were deliberately pieced together
to tell a story, and it always makes me angry. If you have already decided that
Product A is better than Product B, why go to such lengths to prove it with
data? The goal of data analysis is to let you compare A and B rationally and
help you make the right choice, not to confirm your guess and persuade others
to follow you. Confirming your guess is only one of the possible results of
data analysis.
In the era of big data, which occupations are most popular?
The answer can be found in this year's campus recruitment salary list:
algorithm engineers, artificial intelligence researchers, data analysts, and
similar positions. These positions overlap to some extent in that all of them
involve processing large amounts of data, and this is especially true of the
data scientist. A data scientist's main work is processing and analyzing data,
and part of it overlaps with the work of algorithm engineers and artificial
intelligence researchers; the data scientist's advantage is a greater
sensitivity to data. So what skills should a data scientist have? This article
gives an overview.
Academic background
Data scientists are generally highly educated: 88% hold at least a master's
degree and 46% hold a doctorate, which indicates that becoming a data scientist
usually requires a strong educational background and a deep understanding of
the underlying knowledge. Common majors are computer science, social sciences,
physical sciences, and statistics. The most common fields of study are
mathematics and statistics (32%), followed by computer science (19%) and
engineering (16%). The expertise you gain while pursuing these degrees gives
you the skills you need to process and analyze big data. Can you sit back and
relax once you have earned your degree? The answer is no: this is the era of
lifelong learning. In fact, most data scientists continue to take online
training after earning a master's or doctoral degree in order to learn
specialized skills such as Hadoop or big data querying.
R programming language
For data scientists, R is usually the preferred programming language. R was
designed specifically for statistical computing and data analysis, and it can
be used to tackle most problems encountered in data science. In fact, 43% of
data scientists use R to solve statistical problems. There is one obstacle to
learning R, however: its learning curve is steep, and it can be especially
painful if you have already mastered another programming language. Even so,
there are many R learning resources on the Internet, such as Simplilearn's data
science training with the R programming language.
Technical Skills: Computer Science
Python programming
Python has become very popular recently. With the development of artificial
intelligence and deep learning, Python is reported to have overtaken Java as
the most commonly used programming language. Python is also a common coding
language in data science: according to one survey, 40% of respondents use
Python as their main programming language.
Because Python is so versatile, it can be used for every step of the data
science process. For example, it can read data in a variety of formats and can
easily import SQL tables into your code. It also lets you create data sets.
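As a rough illustration of reading data in more than one format and importing a SQL table, here is a minimal sketch that uses only Python's standard library (sqlite3 and csv). The table name, column names, and sample rows are all hypothetical and exist only to make the example runnable; a real project would connect to its own database and files.

```python
import csv
import io
import sqlite3

# Hypothetical data: a tiny in-memory SQL table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dictionaries
conn.execute("CREATE TABLE users (name TEXT, country TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("ana", "BR", 30), ("bo", "US", 45), ("cy", "US", 15)],
)

# Import the SQL table into Python as a list of dictionaries.
sql_rows = [dict(row) for row in conn.execute("SELECT * FROM users")]
conn.close()

# Read the same kind of records from a second format (CSV text, also hypothetical).
csv_text = "name,country,minutes\ndee,DE,20\n"
csv_rows = [
    {**row, "minutes": int(row["minutes"])}  # CSV values arrive as strings
    for row in csv.DictReader(io.StringIO(csv_text))
]

# Create a combined data set and summarize it.
dataset = sql_rows + csv_rows
minutes_by_country = {}
for row in dataset:
    country = row["country"]
    minutes_by_country[country] = minutes_by_country.get(country, 0) + row["minutes"]

print(minutes_by_country)
```

Third-party libraries such as pandas are commonly used for the same steps; the standard library is used here only to keep the sketch self-contained.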