ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 6

Part A (4 Marks)

Exercise 1: Data Science (1 mark)

Read the article at http://datascience.berkeley.edu/about/what-is-data-science/ and answer the following:

What is Data Science?
There is no consensus on what data science means, as the term means different things to different people. Nonetheless, data science refers to an emerging interdisciplinary field found at the intersection of several disciplines, including social science, statistics, design, information science, and computer science (Datascience@berkeley, n.d.).

According to IBM estimation, what is the percent of the data in the world today that has been created in the past two years?
IBM estimates that 2.5 quintillion bytes of data are created every day, a rate so unprecedented that a full 90 percent of all the data in the world today has been generated in the last two years.
These data originate from diverse sources.

What is the value of petabyte storage?
One petabyte is 1,024 terabytes, or roughly one million gigabytes (1,048,576 GB in binary units).

For each course, both foundation and advanced, you find at http://datascience.berkeley.edu/academics/curriculum/ briefly state (in 2 to 3 lines) what they offer?
Base your answers on the given course descriptions as well as on the video.

Research Design and Application for Data and Analysis is a foundation course that teaches students to apply disciplined, creative methods so that they can ask better questions, collect data efficiently, interpret findings, and present those findings to different audiences.

Data Science W203: Exploring and Analyzing Data introduces learners to the quantitative research methods and statistical techniques used in data analysis. It covers inferential statistics, sampling, experimental design, tests of differences, measurement, and general linear models.

Data Science W205: Storing and Retrieving Data covers the data storage, management, and retrieval needed for analysis. It aims to give students the theoretical knowledge and practical experience to master big data management, storage, and retrieval.

Data Visualization and Communication is a foundation course in which students learn to communicate patterns found in data clearly and effectively. It focuses on the design and implementation of complementary visual and verbal representations of analyses for presenting results, responding to questions, and driving decisions.

Data Science W251: Scaling Up! Really Big Data gives students an overview of the complementary toolkits for solving problems associated with big data and cloud computing.

Data Science W231: Behind the Data: Humans and Values is an advanced course that introduces students to the legal, policy, and ethical implications of data. It covers related issues including data privacy, surveillance, security, classification, and discrimination, among others.

Experiments and Causal Inference is an advanced course offering skills in experimental design, statistical analysis, communicating findings, cleaning data, and mining and exploring data. It also introduces learners to experimentation and design-based inference.

Data Science W271: Applied Regression and Time Series Analysis is an advanced course that offers skills in applying more advanced methods derived from regression analysis and time series models. It emphasizes the selection, application, and implementation of statistical techniques to detect significant patterns and develop insights from data.