How to get insights from our dataset without writing code?


This content originally appeared on DEV Community and was authored by Sergio Kaz

Data scientists spend most of their time (about 50% to 80%) cleaning, preparing, and organizing data.

There are many tools on the market for this, but I'll show you one of the most powerful that I've ever seen.

Welcome, AWS Glue DataBrew!

AWS Glue DataBrew

AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning.

Why is it so powerful?

Because you can clean, prepare, and organize your data at scale, paying only for the amount of data and the time you use.

Step by step: using DataBrew to get insights

Prerequisites

  • AWS Account

Create a bucket and upload your dataset

You can create a new bucket from the Amazon S3 console.

Once the bucket exists, upload your dataset to it.
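The console is all you need for this step. If you later want to script it, a minimal sketch with boto3 could look like the one below. The function names, bucket name, region, and file paths are placeholders of mine, not part of the original demo, and the upload function assumes boto3 is installed and AWS credentials are configured.

```python
def create_bucket_kwargs(bucket_name: str, region: str) -> dict:
    """Build the keyword arguments for the S3 create_bucket call."""
    kwargs = {"Bucket": bucket_name}
    # S3 rejects an explicit LocationConstraint of us-east-1,
    # so the configuration is only added for other regions.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def upload_dataset(bucket_name: str, region: str, local_path: str, key: str) -> None:
    """Create the bucket and upload one file to it (hypothetical helper)."""
    import boto3  # imported here so the helper above works without boto3

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**create_bucket_kwargs(bucket_name, region))
    s3.upload_file(local_path, bucket_name, key)
```

For example, `upload_dataset("my-databrew-demo", "us-east-1", "data.csv", "input/data.csv")` would create the bucket and upload a local file, where every argument is a made-up value.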

Set up the dataset on [DataBrew](https://us-east-1.console.aws.amazon.com/databrew/home)

First, connect your dataset to DataBrew.

DataBrew offers several ways to connect to a dataset. For this demo, we use Amazon S3.

Now select the S3 bucket you created earlier, and then select the dataset.

After that, click Create.
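For reference, the same connection can be scripted. The sketch below builds the request for the DataBrew `create_dataset` call in boto3 from an S3 URI. The helper names, the dataset name, and the URI are assumptions of mine, and I assume a CSV file; the last function needs boto3 and AWS credentials.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split 's3://bucket/path/to/file' into (bucket, key)."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid S3 object URI: {uri!r}")
    return parsed.netloc, key


def dataset_request(name: str, s3_uri: str, file_format: str = "CSV") -> dict:
    """Build the keyword arguments for DataBrew's create_dataset call."""
    bucket, key = split_s3_uri(s3_uri)
    return {
        "Name": name,
        "Format": file_format,  # assumption: the demo file is a CSV
        "Input": {"S3InputDefinition": {"Bucket": bucket, "Key": key}},
    }


def register_dataset(request: dict, region: str = "us-east-1") -> str:
    """Send the request to DataBrew and return the dataset name."""
    import boto3  # imported here so the helpers above work without boto3

    databrew = boto3.client("databrew", region_name=region)
    return databrew.create_dataset(**request)["Name"]
```

Keeping the request as a plain dictionary makes it easy to inspect before anything is sent to AWS.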

Run data profile

Once the dataset is connected, select it and click Run data profile.

There you will see several options, such as the number of rows the job should run on and the output location.

At the end of the form, there is a section named Permissions.

There, select Create new IAM role, enter a role name, and click Create and run job.
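The form above maps fairly directly onto the DataBrew API. As a rough sketch, the boto3 calls `create_profile_job` and `start_job_run` take the same choices: the dataset, the number of rows, the output location, and the role. The helper names and all values are hypothetical. Unlike the console, this sketch assumes the IAM role already exists and that you pass its ARN.

```python
from typing import Optional


def profile_job_request(
    job_name: str,
    dataset_name: str,
    output_bucket: str,
    role_arn: str,
    sample_rows: Optional[int] = None,
    output_prefix: str = "",
) -> dict:
    """Build the keyword arguments for DataBrew's create_profile_job call."""
    output_location = {"Bucket": output_bucket}
    if output_prefix:
        output_location["Key"] = output_prefix

    # No row count means "profile the whole dataset".
    if sample_rows is None:
        job_sample = {"Mode": "FULL_DATASET"}
    elif sample_rows > 0:
        job_sample = {"Mode": "CUSTOM_SAMPLE", "Size": sample_rows}
    else:
        raise ValueError("sample_rows must be a positive number of rows")

    return {
        "Name": job_name,
        "DatasetName": dataset_name,
        "OutputLocation": output_location,
        "RoleArn": role_arn,
        "JobSample": job_sample,
    }


def create_and_run(request: dict, region: str = "us-east-1") -> str:
    """Create the profile job, start it, and return the run ID."""
    import boto3  # imported here so the helper above works without boto3

    databrew = boto3.client("databrew", region_name=region)
    databrew.create_profile_job(**request)
    return databrew.start_job_run(Name=request["Name"])["RunId"]
```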

Wait until the job finishes

In the Jobs section, under Profile jobs, you can follow the job's status while it runs.
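If you start the job from a script instead of the console, waiting means polling. Here is a small sketch of a polling loop. The function names and the default intervals are my own choices, and the second function assumes boto3, AWS credentials, and a job name and run ID that you already have.

```python
import time
from typing import Callable

# States after which a DataBrew job run will not change any more.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def wait_for_job(
    get_state: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_state() until it returns a terminal state, then return it."""
    for attempt in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        # Pause between polls, but not after the last one.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")


def databrew_state_getter(job_name: str, run_id: str, region: str = "us-east-1"):
    """Return a function that reads the run's current state from DataBrew."""
    import boto3  # imported here so wait_for_job works without boto3

    databrew = boto3.client("databrew", region_name=region)
    return lambda: databrew.describe_job_run(Name=job_name, RunId=run_id)["State"]
```

A call such as `wait_for_job(databrew_state_getter("demo-profile-job", run_id))` would block until the run ends and return its final state.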

When the job finishes, click on View data profile. The profile shows:

  • A summary of the dataset and the correlation between variables
  • The value distribution
  • A summary of each column
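To make those numbers less of a black box, here is a tiny sketch of what a column summary, a value distribution, and a correlation are. It is plain Python on made-up lists, with function names of my own. It only illustrates the ideas and is not how DataBrew computes its profile.

```python
import math
from collections import Counter


def column_summary(values: list) -> dict:
    """Count rows, missing values (None), and distinct values in a column."""
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def value_distribution(values: list) -> list:
    """Return (value, count) pairs, most frequent first, ignoring None."""
    return Counter(v for v in values if v is not None).most_common()


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation between two numeric columns of equal length."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("columns must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)
```

A correlation close to 1 or -1 means two columns move together (in the same or opposite direction), while a value near 0 means there is no linear relationship between them.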

There are many more insights that you can get with DataBrew; this is only a short introduction.


