Exercise: Data

I) Multiple Choice Questions

  1. Which of the following is NOT a form of data?

    1. Text
    2. Numeric
    3. Audio/Visual
    4. Emotions
    5. Models

Answer:

  2. Quantitative data are measures of:

    1. Types
    2. Values or counts
    3. Categories
    4. Names
    5. Symbols

Answer:

  3. Which of the following is an example of continuous data?

    1. Number of books
    2. Gender
    3. Height
    4. Hair color
    5. Education level

Answer:

  4. Semi-structured data is best managed in:

    1. Relational databases
    2. NoSQL databases
    3. Data lakes
    4. Data warehouses
    5. Data marts

Answer:

  5. A healthcare provider collects patient data from various sources, including electronic health records, medical imaging, and wearable devices. Which V’s of Big Data apply to this scenario?

    1. Volume and Variety
    2. Volume and Velocity
    3. Variety and Veracity
    4. Velocity and Veracity
    5. Volume, Velocity, and Variety

Answer:

  6. A financial institution must ensure that the data used for fraud detection is accurate and trustworthy. Which V of Big Data is most critical in this context?

    1. Volume
    2. Velocity
    3. Variety
    4. Veracity
    5. All of the above

Answer:

II) Match the following terms with their definitions:

Definition

A. Raw facts and figures that are collected from various sources and can be processed to extract meaningful information.

B. A small set of data that can be easily managed and analyzed using traditional data processing tools.

C. The process of extracting meaningful information from data.

D. A field that solely focuses on the storage and retrieval of data without any analysis.

E. Structured information that has been processed and organized to provide insights.

F. A massive amount of data sets that cannot be stored, processed, or analyzed using traditional tools.

G. A multidisciplinary field that aims to produce broader insights by combining skills from statistics, computer science, and domain expertise.

Term

  1. Data: Answer

  2. Big Data: Answer

  3. Data Analytics: Answer

  4. Data Science: Answer

III) Fill in the Blanks

  1. ________ data are measures of ‘types’ and may be represented by a name, symbol, or a number code.
  2. A ________ is a subset of an organization’s data that is usually created for a specific user group or department.
  3. ________ is the programming language used to manage structured data.
  4. A data ________ is a hybrid data storage architecture that combines the features of data warehouses and data lakes.
  5. ________ data does not have a predefined data model and is best managed in non-relational (NoSQL) databases.
  6. The ________ phase in CRISP-DM involves determining the business question, objective, and success criteria.

IV) Written Answer

  1. Explain the differences between a data analyst, a data engineer, and a data scientist. Give examples of the main responsibilities of each role.