Normalize and Clean Data Faster and More Efficiently with AWS Glue DataBrew

No code required: AWS Glue DataBrew is a visual interface that helps developers normalize and clean their data quickly.

AWS Glue is a service that saves developers time by automating much of the data preparation involved in transforming and analyzing data in ETL (extract, transform, load) jobs. With AWS Glue DataBrew, AWS says data preparation for analytics and machine learning projects can take up to 80% less time.

AWS Glue DataBrew helps with data preparation tasks such as:

  • Filtering Anomalies
  • Standardizing Formats
  • Correcting Invalid Formats
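To make these three tasks concrete, here is a minimal sketch in plain Python of what each one means on a tiny dataset. It is not how DataBrew works internally, and it does not use any DataBrew API. The sample records, column names, helper functions, and the `max_ratio` threshold are all assumptions made up for illustration.

```python
from datetime import datetime
from statistics import median

# Hypothetical sample records; the columns and values are invented for this sketch.
rows = [
    {"order_date": "2021-01-05", "amount": "19.99"},
    {"order_date": "01/07/2021", "amount": "$24.50"},
    {"order_date": "2021-01-09", "amount": "n/a"},
    {"order_date": "2021-01-10", "amount": "9999.00"},
    {"order_date": "2021-01-11", "amount": "21.00"},
]


def standardize_date(value):
    """Standardizing formats: accept a few known layouts, emit ISO 8601 dates."""
    for layout in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(value, layout).date().isoformat()
        except ValueError:
            continue
    return None  # layout not recognized


def parse_amount(value):
    """Correcting invalid formats: strip currency symbols and separators."""
    try:
        return float(value.replace("$", "").replace(",", "").strip())
    except ValueError:
        return None  # cannot be corrected, e.g. "n/a"


def clean(records, max_ratio=10.0):
    """Apply the three steps; max_ratio is an arbitrary cutoff for this sketch."""
    parsed = [
        {
            "order_date": standardize_date(r["order_date"]),
            "amount": parse_amount(r["amount"]),
        }
        for r in records
    ]
    # Drop rows whose values could not be standardized or corrected.
    valid = [
        r for r in parsed
        if r["order_date"] is not None and r["amount"] is not None
    ]
    if not valid:
        return []
    # Filtering anomalies: drop amounts far above the median.
    typical = median(r["amount"] for r in valid)
    return [r for r in valid if r["amount"] <= typical * max_ratio]


cleaned = clean(rows)
for row in cleaned:
    print(row)
```

In this sample, the row with the unparseable amount and the row with the outlying amount are dropped, and the remaining dates all come out in the same format. DataBrew lets you build equivalent steps by pointing and clicking rather than writing this kind of code by hand.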

DataBrew offers over 250 pre-built transformations that automate data preparation tasks like the ones above. You can also interactively discover, visualize, clean, and transform raw data using DataBrew’s visual interface. To get started, create a project and connect your data source in the AWS Glue DataBrew console.

Next, you can visualize your data and choose from the 250+ available point-and-click transformations. For example, you can apply NLP (natural language processing) transformations from the interface to split sentences into phrases. After you run your transformations, the output is stored in an S3 bucket.
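If you later want to run a recipe you built in the console without clicking through it each time, the same flow can be scripted with the AWS SDK for Python (boto3). The sketch below is one possible way to do that, not something the console requires. It assumes boto3 is installed, AWS credentials are configured, and a recipe has already been published from the console. Every name, bucket, key, and role ARN in it is a hypothetical placeholder, and the `"1.0"` recipe version is an assumption about your recipe.

```python
def build_dataset_request(name, bucket, key):
    """Arguments for DataBrew's create_dataset call, pointing at one S3 object."""
    return {
        "Name": name,
        "Input": {"S3InputDefinition": {"Bucket": bucket, "Key": key}},
    }


def build_recipe_job_request(job_name, dataset_name, recipe_name, role_arn,
                             out_bucket, out_prefix, recipe_version="1.0"):
    """Arguments for create_recipe_job; results are written to S3 as CSV.

    recipe_version="1.0" is an assumption: adjust it to the version you published.
    """
    return {
        "Name": job_name,
        "DatasetName": dataset_name,
        "RecipeReference": {"Name": recipe_name, "RecipeVersion": recipe_version},
        "RoleArn": role_arn,
        "Outputs": [
            {"Location": {"Bucket": out_bucket, "Key": out_prefix}, "Format": "CSV"}
        ],
    }


def run_job(dataset_request, job_request):
    """Create the dataset and job, start a run, and return its run ID."""
    import boto3  # imported here so building requests needs no AWS SDK

    client = boto3.client("databrew")
    client.create_dataset(**dataset_request)
    client.create_recipe_job(**job_request)
    return client.start_job_run(Name=job_request["Name"])["RunId"]


# Hypothetical placeholders; replace with your own resources before calling run_job.
dataset_request = build_dataset_request(
    "orders-raw", "example-input-bucket", "raw/orders.csv"
)
job_request = build_recipe_job_request(
    "orders-clean-job",
    "orders-raw",
    "orders-clean-recipe",
    "arn:aws:iam::123456789012:role/ExampleDataBrewRole",
    "example-output-bucket",
    "clean/orders/",
)
```

Calling `run_job(dataset_request, job_request)` would create real AWS resources and may incur charges, so the sketch only builds the request dictionaries and leaves that call to you.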

By Dale Yarborough

I am a Software Engineer at General Motors and an Appalachian State University alum. Previously: Whole Foods Market IT, Charles Schwab.