This blog post gives a quick review of the questions discussed in the Day 3 session of our Microsoft Azure Data Fundamentals [DP-900] program. The Azure DP900 Certification is for anyone who wants to start working with, or shift their focus to, Azure Cloud Data Services for data-related tasks.
In the previous week's Day 2 session, we got an overview of Relational Data Services in Azure and of querying relational data in Azure.
The Microsoft Azure DP900 Certification gives a holistic overview of the most common data services. It also covers non-relational data solutions in Azure, such as the Azure Storage Account (table, file, and blob storage), Azure Cosmos DB, and Azure Data Lake Storage.
We covered the following Modules in the Azure DP900 Day 3 Session:
- Module 03: Explore Non-Relational Data in Azure
Here are the questions that we discussed in the Azure DP900 Day 3 Session:
> Module 03: Explore Non-Relational Data in Azure
This module covered the non-relational data offerings in Azure, how they help in developing applications that use non-relational data, and how to provision them. Some popular non-relational data solutions in Azure are the Azure Storage Account (table, file, and blob storage), Azure Cosmos DB, and Azure Data Lake Storage. We also saw, hands-on, how to upload, download, and query data within these non-relational datastores.
Q1: What is an Azure Storage Account?
A: An Azure Storage Account is a Microsoft-managed cloud service that provides storage that is highly available, secure, durable, scalable, and redundant. Whether it is images, audio, video, logs, configuration files, or sensor data from an IoT array, data needs to be stored in a way that makes it easily accessible for analysis, and Azure Storage provides options for each of these use cases.
The Azure Storage Platform includes these services: Blobs (Containers), File Shares, Queues, Tables, Data Lake, and Disks (for general-purpose version 2 Storage Accounts only)
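As a rough illustration of how one storage account exposes several services, the sketch below builds the default public endpoint for each service from the account name. The account name `examplestore` is made up for the example, and Disks are left out of the sketch.

```python
# Default public endpoint suffixes for the services of a storage account.
ENDPOINT_SUFFIXES = {
    "blob": "blob.core.windows.net",
    "file": "file.core.windows.net",
    "queue": "queue.core.windows.net",
    "table": "table.core.windows.net",
    "dfs": "dfs.core.windows.net",  # Data Lake Storage
}


def service_endpoint(account_name: str, service: str) -> str:
    """Build the default HTTPS endpoint for one service of a storage account."""
    return f"https://{account_name}.{ENDPOINT_SUFFIXES[service]}"


if __name__ == "__main__":
    # "examplestore" is a hypothetical account name used only for illustration.
    for service in ENDPOINT_SUFFIXES:
        print(service, service_endpoint("examplestore", service))
```

Each service gets its own hostname under the same account, which is why one account can serve blobs, files, queues, and tables side by side.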
Also Read: Structured vs Unstructured Data, to know the major differences between them.
Q3: When to use Azure Storage Account and Data Lake Store?
A: Use an Azure Storage Account when you want to store infrequently accessed, general-purpose data in a flat namespace. Use Azure Data Lake Store when you want to store data in a hierarchical namespace (files within folders) and use that data for analytical (Big Data) workloads such as processing, transformation, or Machine Learning tasks.
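To make the flat versus hierarchical distinction concrete, here is a small in-memory sketch. It is not Azure code: the dictionaries and function names are invented for illustration. It shows that in a flat namespace a "folder" is only a name prefix, so renaming it touches every object, while in a hierarchical namespace a directory is a real node that can be renamed in one step.

```python
# Flat namespace: every object is one key; "/" is just a character in the name.
flat_store = {
    "logs/2021/app.log": b"a",
    "logs/2021/db.log": b"b",
    "images/logo.png": b"c",
}


def list_flat(store, prefix):
    """Listing a 'folder' in a flat namespace is a prefix scan over all keys."""
    return sorted(name for name in store if name.startswith(prefix))


def rename_flat(store, old_prefix, new_prefix):
    """Renaming a 'folder' rewrites every matching key; returns keys touched."""
    touched = 0
    for name in list(store):  # snapshot of keys, since the dict is mutated
        if name.startswith(old_prefix):
            store[new_prefix + name[len(old_prefix):]] = store.pop(name)
            touched += 1
    return touched


def make_tree(store):
    """Hierarchical namespace: directories are real nodes containing children."""
    tree = {}
    for name, data in store.items():
        *dirs, leaf = name.split("/")
        node = tree
        for d in dirs:
            node = node.setdefault(d, {})
        node[leaf] = data
    return tree


def rename_dir(tree, old, new):
    """Renaming a top-level directory is a single operation on one node."""
    tree[new] = tree.pop(old)
    return 1
```

This difference is one reason analytics workloads, which move and rename whole directories of files, tend to favour a hierarchical namespace.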
Q4: What are Azure Storage File Shares?
A: Azure Files offers fully managed cloud file shares that you can access from anywhere via the industry-standard Server Message Block (SMB) protocol. You can mount Azure file shares from cloud or on-premises deployments of Windows, Linux, and macOS.
Q5: What is the maximum data hierarchy allowed in Azure Data Lake Store?
A: There is no limit as such on the number of hierarchy levels allowed in Azure Data Lake Store. Folders can be nested to whatever depth the data requires.
Q6: Is Azure Block Blob a normal file system or a special type of file system?
A: Azure Block Blob stores unstructured data, such as text and images. To the user it looks like a normal on-premises file system, but at the backend it stores data in a special way so that it can scale to huge amounts of data. Block blobs are made up of blocks of data, and each block can itself hold a large amount of data.
Here are some scenarios where we use Azure Block Blobs:
- For applications to access images or documents
- File sharing for distributed access
- Streaming video and audio from the Azure Cloud
- Managing log files or for data backups and archiving for restoring later.
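The "made up of blocks" idea can be sketched in a few lines. The class below is an in-memory stand-in, not the Azure SDK: it stages blocks under IDs and then commits an ordered list of IDs, which loosely mirrors the stage-then-commit pattern the storage service uses for block blobs. All names here are invented for the example.

```python
import base64


class BlockBlobSketch:
    """In-memory stand-in for a block blob (illustrative, not the Azure SDK)."""

    def __init__(self):
        self.staged = {}     # block ID -> bytes, uploaded but not yet visible
        self.committed = []  # ordered block IDs that make up the blob

    def stage_block(self, index: int, data: bytes) -> str:
        # Block IDs are opaque strings; a zero-padded, base64-encoded
        # counter is used here as one simple convention.
        block_id = base64.b64encode(f"{index:06d}".encode()).decode()
        self.staged[block_id] = data
        return block_id

    def commit_block_list(self, block_ids):
        # The blob's content is defined by the committed ID list, in order.
        missing = [b for b in block_ids if b not in self.staged]
        if missing:
            raise KeyError(f"unknown block IDs: {missing}")
        self.committed = list(block_ids)

    def read(self) -> bytes:
        return b"".join(self.staged[b] for b in self.committed)


def upload_in_blocks(data: bytes, block_size: int) -> BlockBlobSketch:
    """Split data into fixed-size blocks, stage each one, then commit the list."""
    blob = BlockBlobSketch()
    ids = [
        blob.stage_block(offset // block_size, data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]
    blob.commit_block_list(ids)
    return blob
```

Because each block is uploaded independently, a large file can be sent in parallel or resumed after a failure, and only the final commit makes the new content visible.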
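Q7: Does Azure Block Blob import structured data into the Azure container?
A: Azure Block Blobs store text or binary data. A structured file can be uploaded, but it is stored as-is, as a file, without its structure being interpreted.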
Q8: What is typically the most useful kind of Azure Blob?
A: Azure Blob Storage supports three types of blobs, each suited to its own purpose:
- Block Blobs: They store data in multiple blocks, each identified by a unique block ID, and can hold huge amounts of text or binary data. Block blobs make it practical to upload and manage large files over a network.
- Append Blobs: They also hold data in multiple blocks, but the block IDs are not exposed and, once written, existing blocks cannot be updated or deleted. New data is always appended to the end of the blob, which makes append blobs well suited to storing log data from other resources.
- Page Blobs: They are collections of fixed-size pages optimized for random read and write operations. Virtual machine data is stored in page blobs as virtual hard disk files, so for a virtual machine they act as disks. A page blob can hold up to 8 TB of data.
Q9: Can you explain Azure database replication?
A: Azure offers replication of your database data for increased availability and disaster recovery. Though Azure handles data replication for all resources at the backend, for Azure Databases we can select what type of data replication we would like to have – on the basis of data security policies, application use case, and other parameters. The Standard Geo-Replication option replicates committed database transactions asynchronously from the primary (original) database to the secondary (replicated) database in a predefined Azure region.
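The asynchronous behaviour described above can be modelled with a toy class. This is a simulation, not an Azure API: writes are acknowledged once the primary commits them, and a separate step ships them to the secondary later, so the secondary can lag behind.

```python
from collections import deque


class GeoReplicatedDb:
    """Toy model of asynchronous primary-to-secondary replication."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # committed on the primary, not yet shipped

    def write(self, key, value):
        # The write is acknowledged as soon as the primary commits it;
        # the secondary is updated later.
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate(self, max_items=None):
        """Ship pending transactions to the secondary in commit order."""
        shipped = 0
        while self.pending and (max_items is None or shipped < max_items):
            key, value = self.pending.popleft()
            self.secondary[key] = value
            shipped += 1
        return shipped

    def lag(self):
        """Number of committed transactions the secondary has not yet seen."""
        return len(self.pending)
```

The lag is the price of asynchronous replication: the primary never waits for the remote region, but a failover may lose the most recent transactions that were still pending.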
Q10: Does database replication allow an active-active model (both regions have the data-write capability)?
A: In general, database replication in Azure follows an active-passive model. Azure Cosmos DB, however, offers an active-active model.
Q11: Do all tiers and types of Storage Accounts support Geo-Replication?
A: Not all of them. Standard general-purpose v2 storage accounts support the following Geo-Replication options, whereas premium storage accounts are limited to redundancy within a single region:
- Geo-redundant storage (GRS): This option enables you to create a non-readable, asynchronous replica of your primary region data in a secondary region.
- Read-Access Geo-Redundant Storage (RA-GRS): This option allows you to create readable GRS replicas.
- Geo-zone-redundant storage (GZRS): This option creates three replicas, one each in three Availability Zones in the primary region, then creates an asynchronous non-readable replica of the data in the secondary region.
- Read-Access Geo-Zone-Redundant Storage (RA-GZRS): You can create readable GZRS replicas using this option.
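The four options differ along two axes: whether the primary region spreads copies across availability zones, and whether the secondary region is readable. The lookup below is a small sketch of that grid; the class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoOption:
    name: str
    zones_in_primary: bool    # copies spread over three availability zones
    readable_secondary: bool  # secondary region can serve read requests


GEO_OPTIONS = {
    "GRS": GeoOption("Geo-redundant storage", False, False),
    "RA-GRS": GeoOption("Read-access geo-redundant storage", False, True),
    "GZRS": GeoOption("Geo-zone-redundant storage", True, False),
    "RA-GZRS": GeoOption("Read-access geo-zone-redundant storage", True, True),
}


def pick_option(need_zone_redundancy: bool, need_secondary_reads: bool) -> str:
    """Return the option whose two properties match the stated requirements."""
    for sku, option in GEO_OPTIONS.items():
        if (option.zones_in_primary == need_zone_redundancy
                and option.readable_secondary == need_secondary_reads):
            return sku
    raise ValueError("no matching option")
```

For instance, `pick_option(True, True)` selects RA-GZRS, the option that combines zone redundancy in the primary region with a readable secondary.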
Q12: How many data backups are available in Azure Databases? Can we get any of them?
A: Azure PaaS Database offerings take automatic database backups on a recurring basis. We cannot access the backup files directly, because Azure manages them itself in PaaS solutions.
Q14: Is there any provision of data backup for Azure Cosmos DB?
A: Yes. Since Azure Cosmos DB is a PaaS offering, backups are taken automatically. We cannot take manual backups or access the backup files, but we can initiate a restore manually.
Q16: What is the difference between the Azure Table and a normal table in the Azure SQL Database?
A: Azure SQL Database stores relational data, whereas Azure Tables store non-relational data of massive sizes.
Q17: In AWS, we create an IAM role for Security. How do we do it in Azure?
A: In Azure Security Services, the Identity and Access Management (IAM) service sits under the Azure Active Directory (AAD) service. Azure IAM is an access management service that is also called role-based access control (RBAC). It allows the administrator to assign roles such as owner, contributor, and reader to users, which determine to what extent a user can work with a particular resource.
The Azure Active Directory is the fundamental identity management service which is a live directory storing user accounts, their passwords, and their access rights to various Azure Subscriptions.
Azure AD allows users to access their subscriptions while Azure IAM allows users to work with the resources within those subscriptions.
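The role idea can be sketched as a small permission check. This is a simplified model, not how Azure evaluates access: the action names, users, and scope paths are hypothetical. It captures the broad difference between the three built-in roles (a Reader can only view, a Contributor can also change resources, and an Owner can additionally assign roles to others).

```python
# Simplified action sets per role; real role definitions are far more granular.
ROLE_ACTIONS = {
    "Reader": {"read"},
    "Contributor": {"read", "write", "delete"},
    "Owner": {"read", "write", "delete", "assign_roles"},
}

# (user, scope) -> role. Users and scope paths are hypothetical.
assignments = {
    ("alice", "/subscriptions/demo/resourceGroups/data-rg"): "Owner",
    ("bob", "/subscriptions/demo/resourceGroups/data-rg"): "Reader",
}


def is_allowed(user: str, action: str, resource: str) -> bool:
    """A role assigned at a scope applies to everything beneath that scope."""
    for (who, scope), role in assignments.items():
        in_scope = resource == scope or resource.startswith(scope + "/")
        if who == user and in_scope and action in ROLE_ACTIONS[role]:
            return True
    return False
```

With these assignments, `bob` can read a storage account inside `data-rg` but cannot modify it, while `alice` can also grant access to other users.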
Q18: What is Azure Active Directory?
A: Azure Active Directory (Azure AD) is Microsoft's Identity and Access Management (IAM) service. Azure Administrators use it to manage access to both internal and external network resources, and to grant or revoke users' permissions to use those resources.
Q19: Do the names of the objects we create have to be unique within the RG or the region?
A: A storage account needs a name that is unique across all of Azure, because the name becomes part of its public endpoint. Most other resources only need a name that is unique within their resource group.
Q20: What’s the purpose of a partition key?
A: In Azure, the partition key is a property that exists on every object and is used to group similar objects together.
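Q21: What is the difference between a primary key and a partition key?
A: The partition key decides which partition a record is stored in: its value is hashed to pick the partition, much like the key of a hash table. The primary key is the one or more columns that uniquely identify a record in the table.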
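The two roles can be shown side by side in a toy partitioned table. This is an illustrative sketch, not a real storage engine: the partition count, the hash function, and the choice of `(partition_key, row_id)` as the primary key are all assumptions made for the example.

```python
import hashlib

NUM_PARTITIONS = 4  # assumption for the sketch


def partition_for(partition_key: str) -> int:
    """Hash the partition key to choose which partition holds the record."""
    digest = hashlib.sha256(partition_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class PartitionedTable:
    def __init__(self):
        self.partitions = [dict() for _ in range(NUM_PARTITIONS)]

    def insert(self, partition_key: str, row_id: str, record: dict) -> None:
        # The primary key must be unique; here it is (partition_key, row_id).
        partition = self.partitions[partition_for(partition_key)]
        primary_key = (partition_key, row_id)
        if primary_key in partition:
            raise KeyError(f"duplicate primary key {primary_key}")
        partition[primary_key] = record

    def get(self, partition_key: str, row_id: str) -> dict:
        # A point lookup only needs to consult one partition.
        partition = self.partitions[partition_for(partition_key)]
        return partition[(partition_key, row_id)]
```

Many records may share a partition key, and they all land in the same partition, but no two records may share a primary key.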
Q22: Is Blob storage used for unstructured data?
A: Yes. Blob storage is optimized for storing massive amounts of unstructured data, that is, data that doesn't adhere to a particular data model or definition, such as text or binary data.
Q23: Is indexing the only reason for the speed of Cosmos DB?
A: Indexing is a major reason. In Azure Cosmos DB, every container has an indexing policy that dictates how the container's items are indexed. By default, every property of every item is indexed.
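To see why an index makes queries fast, compare a lookup through an index with a scan over every item. The sketch below is a generic in-memory illustration, not the Cosmos DB engine; the sample items and function names are invented, and `indexed_properties` stands in for whatever an indexing policy chooses to include.

```python
from collections import defaultdict

# Hypothetical items in a container.
items = [
    {"id": "1", "category": "bike", "price": 250},
    {"id": "2", "category": "helmet", "price": 40},
    {"id": "3", "category": "bike", "price": 900},
]


def build_index(items, indexed_properties):
    """Map (property, value) -> item ids for the properties being indexed."""
    index = defaultdict(set)
    for item in items:
        for prop in indexed_properties:
            if prop in item:
                index[(prop, item[prop])].add(item["id"])
    return index


def query_with_index(index, prop, value):
    # Direct lookup: no need to inspect every item.
    return sorted(index.get((prop, value), set()))


def query_with_scan(items, prop, value):
    # Without an index on the property, every item has to be inspected.
    return sorted(item["id"] for item in items if item.get(prop) == value)
```

Both functions return the same ids, but the scan's cost grows with the number of items, while the index lookup does not. The trade-off is that every indexed property adds work at write time.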
Q24: If we unknowingly write a long-running query, do we get any alert or recommendation to improve the query approach?
A: You can use diagnostics logs to identify queries that are slow or that consume significant amounts of throughput.
Q25: I want to populate the AdventureWorks DB, not the LT version. How can I do that without using SSMS or Azure Data Studio?
A: You can use Azure CLI.
Q26: What type of attribute should be selected as the partition key? Should we partition on product ID or, say we have scaling data for product dimensions, can we partition on product length, width, and so on?
A: Take a bank's customer data as an example. A good partition key might be the customer number, since it is different for each customer. A poor partition key might be the zip code, because most customers live in the same area near the bank. The simple rule is to choose a partition key that has a wide range of different values.
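The effect of that rule can be measured with a quick simulation. The data, partition count, and hash function below are all assumptions for the sketch: 1,000 made-up customers are spread over four partitions, once by customer number and once by zip code.

```python
import hashlib
from collections import Counter


def partition_for(key: str, partitions: int = 4) -> int:
    """Hash a key to one of a fixed number of partitions."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def distribution(records, key_field, partitions=4):
    """Count how many records land in each partition for a given key choice."""
    counts = Counter(
        partition_for(str(record[key_field]), partitions) for record in records
    )
    return [counts.get(p, 0) for p in range(partitions)]


# 1,000 hypothetical customers, almost all of whom share two zip codes.
customers = [
    {"customer_number": n, "zip_code": "10001" if n % 10 else "10002"}
    for n in range(1000)
]

by_customer = distribution(customers, "customer_number")
by_zip = distribution(customers, "zip_code")

if __name__ == "__main__":
    print("by customer number:", by_customer)
    print("by zip code:       ", by_zip)
```

Partitioning by zip code piles at least 900 of the 1,000 records into a single partition, since only two distinct key values exist, while the customer number spreads them far more evenly.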
Q27: Practical use case (industry) for use of Table storage, please?
A: For any data that is de-normalized we can use table storage.
Q28: Can Cosmos DB store relational data in the form of tables and records?
A: No. Cosmos DB stores data in the form of documents, similar to JSON files.
Q29: How would I decide which API is used when?
A: Azure Cosmos DB provides different APIs to access and interact with the data it stores. Your default choice for new Azure Cosmos DB accounts should be Core (SQL). However, you should also consider the following situations:
If your data is better represented in a graph, then the Gremlin (graph) API might be a good choice.
If you already have an existing application or database that is using one of the other APIs, then the current API might be a better choice for your specific scenario. Using the current API might make it easier to:
- Migrate your application or database to Azure Cosmos DB
- Reuse your existing code with minimal changes
- Leverage the existing knowledge and experience of your development team.
You should only use the Azure Table API if you are migrating from Azure Table Storage, as Core (SQL) offers far more features and flexibility.
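The guidance above can be condensed into a small decision helper. This is only a sketch of those rules of thumb: the function name and parameters are invented, the returned strings are informal labels rather than SDK constants, and "MongoDB" in the comment is just one example of an API an existing application might already use.

```python
from typing import Optional


def choose_cosmos_api(existing_api: Optional[str] = None,
                      graph_shaped: bool = False,
                      migrating_from_table_storage: bool = False) -> str:
    """Pick an Azure Cosmos DB API label following the rules of thumb above."""
    if migrating_from_table_storage:
        # The Table API is recommended only for this migration case.
        return "Table"
    if existing_api is not None:
        # Keep the API the application already speaks (for example "MongoDB").
        return existing_api
    if graph_shaped:
        return "Gremlin"
    # Default choice for new accounts.
    return "Core (SQL)"
```

For a brand-new application with no graph requirements, `choose_cosmos_api()` returns `"Core (SQL)"`, matching the default recommendation.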
Q30: Data Lake Storage can be created by selecting the hierarchical namespace option, so which type does it come under: File storage or Blob storage?
A: Blob storage. Azure Data Lake Storage is built on Azure Blob Storage.
Q31: Any option to move data from on-prem to cosmos DB?
A: To support migration paths from the various sources to the different Azure Cosmos DB APIs, there are multiple solutions that provide specialized handling for each migration path. For example, you can use an Azure Data Factory (ADF) or Azure Synapse pipeline.