Data Science Fundamentals for Python and MongoDB Book FULL FREE PDF 2022.
by Daoued
Python users in data science:
The Python Software Foundation (PSF) and JetBrains have just published a major survey on Python users in 2018.
This survey, based on over 20,000 responses, attempts to draw a picture of Python developers.
The raw data is publicly available. In this article, my aim is to draw a similar picture, but focused on users working in data science.
The purpose of this article is to analyze a subcategory of Python users of the PSF survey, those who use Python to process data.
Methodologically, we extracted the raw data from the survey and performed analyses on users whose main use of Python was data analysis or machine learning.
This sub-category represents 28% of respondents, which allows us to have a sample of 4,585 observations.
The first interesting finding is that 84% of respondents use Python as their main language, which is exactly the same proportion as in the complete survey.
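The selection step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names `main_use` and `python_is_main_language` and the four sample records are hypothetical and do not reflect the actual column names or values of the PSF raw data.

```python
# Hypothetical records standing in for the survey's raw data.
# The field names below are illustrative assumptions, not the real columns.
DATA_USES = {"data analysis", "machine learning"}


def select_data_users(respondents):
    """Keep respondents whose main use of Python is data analysis or ML."""
    return [r for r in respondents if r["main_use"] in DATA_USES]


def share(rows, predicate):
    """Percentage of rows satisfying predicate (0.0 for an empty list)."""
    if not rows:
        return 0.0
    return 100.0 * sum(1 for r in rows if predicate(r)) / len(rows)


respondents = [
    {"main_use": "data analysis", "python_is_main_language": True},
    {"main_use": "web development", "python_is_main_language": True},
    {"main_use": "machine learning", "python_is_main_language": False},
    {"main_use": "devops", "python_is_main_language": True},
]

data_users = select_data_users(respondents)

# Share of the sub-category within the whole sample.
subgroup_share = share(respondents, lambda r: r["main_use"] in DATA_USES)

# Share of the sub-category that uses Python as its main language.
main_language_share = share(
    data_users, lambda r: r["python_is_main_language"]
)

print(f"sub-category: {subgroup_share:.0f}% of respondents")
print(f"Python as main language: {main_language_share:.0f}% of sub-category")
```

The same `share` helper can be reused for any of the percentages discussed in the rest of the article, by changing the predicate.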
The other languages:
The languages that Python users in data science use alongside Python are quite close to those of all users. The main differences concern languages such as R and Scala, which have significantly more users in this group.
Conversely, users of web languages are rarer.
Python 3 is very widely adopted among Python users in data science: 90% of them use it.
This seems logical given how recently Python took hold in data science and the large number of new projects.
How do we install Python in data science?
Going into more detail on the Python tools used, the PSF survey looks at the tools used to install Python. Not surprisingly, Anaconda is much more common here than among other Python users.
Similarly, to create environments, Python users in data science use Anaconda more than other users do. Nevertheless, virtualenv remains the most used.
What packages are used?
Logically enough, it is the data science packages that are popular.
NumPy with 89% of users,
Pandas with 81%,
Scikit-Learn with 66%,
Give me 50%.
On the web framework side, Django is less widely adopted by data science-oriented users. On the other hand, Flask is slightly more used among data science-oriented users than among others (47% vs. 45%).
Which IDE?
The choice of IDE is an important question in Python. Unlike R, for which RStudio is the absolute reference, there is no consensus in Python. No tool exceeds 15% of users, which is very low.
PyCharm remains in the lead, and VS Code and Vim are the other two "classic" tools. Jupyter Notebook comes in second position among data science users, which is specific to this group.
Who answered this survey?
Only 45% of respondents consider themselves developers (compared to 73% in the full survey). Of course, there are many more data analysts than in the full study.
As for teamwork, these users more often work alone (53% versus 48%). They have been working in computing for less time. Finally, in terms of age, this subcategory has more respondents aged 20-29 (45% vs. 39%) and fewer "very young" respondents.
How does MongoDB work?
MongoDB stores data objects in collections and documents rather than in tables and rows, as traditional relational databases do. Collections contain multiple documents and act as the equivalent of relational database tables. Documents are made up of key-value pairs and form the basic unit of data in MongoDB.
The structure of a document can be changed simply by adding new fields or deleting existing ones. Documents can define a primary key as a unique identifier, and values can be of various data types, including other documents, arrays, and arrays of documents.
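To make the document model concrete, here is a minimal sketch that represents a document as a plain Python dictionary, which is how MongoDB drivers for Python typically expose documents. The `order` document and the helper names `with_field` and `without_field` are hypothetical and only serve as an illustration; no database connection is involved.

```python
import copy

# A hypothetical document: key-value pairs whose values can be
# scalars, embedded documents, arrays, or arrays of documents.
order = {
    "_id": "order-1001",  # primary key, unique within the collection
    "customer": {"name": "Ada", "city": "Paris"},  # embedded document
    "tags": ["priority", "gift"],  # array
    "items": [  # array of documents
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}


def with_field(doc, key, value):
    """Return a copy of doc with one field added or overwritten."""
    updated = copy.deepcopy(doc)
    updated[key] = value
    return updated


def without_field(doc, key):
    """Return a copy of doc with one field removed, if present."""
    updated = copy.deepcopy(doc)
    updated.pop(key, None)
    return updated


# The structure changes per document; no schema migration is needed.
shipped = with_field(order, "status", "shipped")
untagged = without_field(order, "tags")

total_qty = sum(item["qty"] for item in order["items"])
print(sorted(shipped.keys()))
print(sorted(untagged.keys()))
print(total_qty)
```

Two documents in the same collection can therefore have different sets of fields, which is the main practical difference from rows in a relational table.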
How does MongoDB text search work?
One of the main features of MongoDB is text search, which allows you to query string fields to find specific text or words. Text search relies on a text index and the $text operator.
A text index can be created on any field whose value is a string or an array of strings. To query the data via text search, the collection must have a text index. A collection can have only one text index, but that index can cover several fields.
You then search a collection that has a text index by using the $text operator. The $text operator tokenizes the search string on whitespace and treats most punctuation as delimiters, except for the hyphen-minus (-), which negates a term, and escaped double quotes (\"), which mark a phrase. After tokenizing the search string, the operator performs a logical OR on the tokens.
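The OR behaviour described above can be imitated in plain Python to see what it implies for query results. This is a deliberately simplified sketch, not MongoDB's implementation: the function names `tokenize` and `text_search` and the sample documents are hypothetical, and the sketch ignores stemming, stop words, negation, phrases and relevance scoring. In a real query, the filter document would have the form `{"$text": {"$search": "coffee bakery"}}`.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on whitespace and punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def text_search(documents, indexed_fields, search):
    """Return documents containing at least one search token.

    indexed_fields plays the role of the single text index, which
    may cover several fields of the collection.
    """
    terms = set(tokenize(search))
    matches = []
    for doc in documents:
        words = set()
        for field in indexed_fields:
            value = doc.get(field, "")
            # An indexed field may hold a string or an array of strings.
            parts = value if isinstance(value, list) else [value]
            for part in parts:
                words.update(tokenize(part))
        if terms & words:  # logical OR: any shared token is a match
            matches.append(doc)
    return matches


docs = [
    {"_id": 1, "title": "Coffee shop", "tags": ["espresso", "bakery"]},
    {"_id": 2, "title": "Tea house", "tags": ["green tea"]},
    {"_id": 3, "title": "Book store", "tags": ["coffee table books"]},
]

found = text_search(docs, ["title", "tags"], "coffee bakery")
ids = [d["_id"] for d in found]
print(ids)
```

Because the tokens are combined with OR, adding words to the search string widens the result set rather than narrowing it.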
Conclusion:
This more specific study of a subpopulation of the PSF survey allows us to understand a little better the profile of Python users in data science. The study retains the collection biases of the initial survey, notably the high rate of Linux users, which seems exaggerated for Python users in an enterprise context. Even so, it is a step toward understanding these users and responding to their needs.