You have a large CSV file. It's a lot of raw data separated by commas.
You wonder: “What do I do with it? How do I get my hands on the information I’m looking for?”
You have three options:
What is Gigasheet? Well, let us show you! Throughout this blog post, we’ll use Gigasheet to analyze a BIG CSV. We will be looking at an IMDB Movie Reviews CSV, which you can find in our Data Community, where we share files and insights about public data sets.
Here are the sections that we’ll be covering:
We’re so excited to share this with you. Let’s dive in!
First, we logged in to Gigasheet. Haven’t signed up yet? Click HERE to do so. Upon logging in, we created a copy of the IMDB dataset we fetched from our Data Community.
However, it's really easy to import your own CSV file from your Gigasheet dashboard. To do that, click on “+ NEW” as displayed in the screenshot, and select “File Upload.”
You can drop the file in directly from your Desktop, or browse your computer and select the file to upload to Gigasheet. You can also import from popular cloud storage sites such as Google Drive:
As mentioned, we created a copy of the IMDB dataset which includes over seven million shows and movies - a BIG CSV. Here’s what the dataset looks like in Gigasheet:
Let’s look up a show called “The Arrival of a Train.” To do so, we’ll use the Search feature in Gigasheet. Just type the term into the search box.
As soon as we clicked on the search button, Gigasheet provided us with the results in mere seconds.
Well, what if the results aren't on page 1? Let's search for “The Prince of Darkness” to see what happens. It's on Page 5.
And here are the results. Gigasheet automatically scrolls to the appropriate page when using the up and down arrows that appear in the search box.
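For readers who think in code, the search-and-jump behavior can be approximated in a few lines of Python. This is only a sketch of the idea, not how Gigasheet works internally: the sample rows, the "Primary Title" key, the page size of 2, and the helper name `find_pages` are all made up for illustration.

```python
# Illustrative only: a handful of made-up rows stand in for the real dataset.
ROWS = [
    {"Primary Title": "Title A"},
    {"Primary Title": "The Arrival of a Train"},
    {"Primary Title": "Title B"},
    {"Primary Title": "The Prince of Darkness"},
    {"Primary Title": "Title C"},
]


def find_pages(rows, term, page_size=2):
    """Return (page_number, row_index) for every row whose title contains term.

    Matching is case-insensitive; pages are numbered from 1.
    """
    needle = term.lower()
    hits = []
    for index, row in enumerate(rows):
        if needle in row["Primary Title"].lower():
            hits.append((index // page_size + 1, index))
    return hits


# Which page would we need to scroll to for each match?
print(find_pages(ROWS, "prince of darkness"))
```

The point of the sketch is that a search does not remove any rows; it only tells you where the matches sit so the view can jump to them.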
With Gigasheet’s search feature, you can find your data in mere seconds. Now, let’s try to find this entry using the Filters feature.
Again – let’s try to find “The Prince of Darkness.” But this time, we’ll use the Filters feature.
We’ll apply the following filter:
As soon as we click on “Apply,” we’ll get the following results:
As simple as that! Filters removed every row except the one matching the term that we were interested in.
Now, let’s dive a bit deeper into the Filters feature. Let’s say you want to look up short movies or TV shows. So, we’ll set the Genre = Short. Here’s what our filter will look like:
We still have over 150k entries upon applying this filter. 150,196 to be exact!
Now, let’s apply a second filter on top of this one, to narrow our results. Let’s say you want to find short movies or TV shows that started or were launched in 1900. So, we’ll add a filter for StartYear:
Now we are down to a much more manageable list of 212 entries:
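If it helps to see the logic spelled out, stacking two filters is the same as keeping only the rows that pass every condition (a logical AND). Here is a minimal Python sketch under stated assumptions: the rows are invented, the column keys "Genres" and "Start Year" are our guess at the names, and `apply_filters` is a hypothetical helper, not a Gigasheet API.

```python
# Illustrative only: made-up rows, not the real IMDB data.
ROWS = [
    {"Primary Title": "Title A", "Genres": "Short", "Start Year": 1900},
    {"Primary Title": "Title B", "Genres": "Short", "Start Year": 1912},
    {"Primary Title": "Title C", "Genres": "Drama", "Start Year": 1900},
    {"Primary Title": "Title D", "Genres": "Short", "Start Year": 1900},
]


def apply_filters(rows, conditions):
    """Keep rows that satisfy every (column, value) pair, i.e. a logical AND."""
    return [
        row for row in rows
        if all(row[column] == value for column, value in conditions)
    ]


# First filter: Genres = Short
short_only = apply_filters(ROWS, [("Genres", "Short")])

# Second filter stacked on top: Start Year = 1900
short_1900 = apply_filters(ROWS, [("Genres", "Short"), ("Start Year", 1900)])

print(len(short_only), len(short_1900))
```

Each added condition can only shrink the result, which is why the second filter takes the list from six digits down to a few hundred in the walkthrough.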
You can add as many AND / OR conditions to your filters as you need to get your hands on the data that you want! You can also save filters for future use, making Gigasheet the best tool for filtering big CSV data!
What if we told you that you can also group your data by a specific column for better navigation? That’s right. We allow our users to group their data by column. Let’s have a look at how to do that.
Wondering how to group your data with Gigasheet? To do that, click on “Group.”
And then let’s group our data by Genres (we have multiple genres!). Groups take all of the entries in a column and roll up the data into a group for each value.
Here’s what grouped data looks like:
As you can see, we have 2,232 unique values that now show up as rows, vs the 7M we started with. Now, let’s say we want to look up movies or TV shows with “Crime, Drama” as the genre. We have 36,337 entries.
Upon clicking on a group, the group will expand and show all of the movies or TV shows with the genre “Crime, Drama.”
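Conceptually, grouping rolls rows up into buckets keyed by the value in one column, and expanding a group just lists the rows in that bucket. The following Python sketch shows that idea with invented rows; the helper `group_by` and the column keys are assumptions for illustration only.

```python
from collections import defaultdict

# Illustrative only: made-up rows, not the real IMDB data.
ROWS = [
    {"Primary Title": "Title A", "Genres": "Crime, Drama"},
    {"Primary Title": "Title B", "Genres": "Short"},
    {"Primary Title": "Title C", "Genres": "Crime, Drama"},
    {"Primary Title": "Title D", "Genres": "Documentary"},
]


def group_by(rows, column):
    """Roll rows up into one bucket per distinct value in the given column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


groups = group_by(ROWS, "Genres")

# The collapsed view: one line per group with its row count.
for genre, members in groups.items():
    print(genre, len(members))

# "Expanding" a group is just listing its members.
crime_drama_titles = [row["Primary Title"] for row in groups["Crime, Drama"]]
print(crime_drama_titles)
```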
Gigasheet can handle groups within groups! Let’s say you want to add another group to the already existing grouping we just created: Group by “Genres” and then by “Start Year.” So, here’s how we’ll set it:
Let’s expand the same genre “Crime, Drama.” Upon clicking on it, we saw it grouped by the “Start Year.” This is a powerful way to explore data.
Now, if we wanted to get our hands on “Crime, Drama” movies or TV shows from 1941, we expand the second group – 1941 - and get our list.
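A group within a group can be pictured as a two-level lookup: first by genre, then by start year. Below is a small Python sketch of that structure. The rows are invented, and `nested_group` is a hypothetical helper we wrote for this example rather than anything Gigasheet exposes.

```python
from collections import defaultdict

# Illustrative only: made-up rows, not the real IMDB data.
ROWS = [
    {"Primary Title": "Title A", "Genres": "Crime, Drama", "Start Year": 1941},
    {"Primary Title": "Title B", "Genres": "Crime, Drama", "Start Year": 1955},
    {"Primary Title": "Title C", "Genres": "Crime, Drama", "Start Year": 1941},
    {"Primary Title": "Title D", "Genres": "Short", "Start Year": 1941},
]


def nested_group(rows, outer, inner):
    """Group rows by the outer column, then by the inner column within each group."""
    tree = defaultdict(lambda: defaultdict(list))
    for row in rows:
        tree[row[outer]][row[inner]].append(row)
    return tree


tree = nested_group(ROWS, "Genres", "Start Year")

# Expand "Crime, Drama", then expand 1941 inside it.
titles_1941 = [row["Primary Title"] for row in tree["Crime, Drama"][1941]]
print(titles_1941)
```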
You can see how Gigasheet's Filters and Groups would allow you to find the proverbial needle in a haystack.
But we are not done! Alongside this, you can even use Pivot Mode – a powerful feature that adds vertical groups, and allows you to slice and group your data by just dragging things around.
First, for those who are not aware of what Pivot mode is – it’s a great way to group, slice, and summarize your data by dragging things around. Allow us to show you how!
First, we grouped our data by “Genres” and here’s the result:
Now, let’s turn on the Pivot mode in Gigasheet. You can do it by clicking on the toggle button in the right panel – as displayed in the screenshot below.
As you turn on the Pivot mode, you’ll see the “Column Groups” option as visible in the screenshot below.
Here, you can drag your columns to create groupings across the top of your screen. So, let’s say you want to find out how many comedy movies or TV shows were launched in 2022 or 2010. Maybe you want to find out how many documentaries were launched in 2022 or 2010.
So, we’ll drag “Start Year” to the “Column Groups.” Also, we’ll drag “Primary Title” to the “Values” section – using which the values will be calculated.
So, here’s what our Row Groups, Values & Column Groups look like:
And here’s the data:
As you can see, 18,205 Drama movies or TV shows were launched in 2014. Whereas, in 2020, it was 20,460.
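In code terms, this pivot is a count of titles for every (genre, start year) pair, laid out with genres down the side and years across the top. The Python sketch below illustrates that with invented rows; it assumes the value being calculated is a simple count of “Primary Title” entries, and `pivot_count` is a hypothetical helper name.

```python
from collections import Counter

# Illustrative only: made-up rows, not the real IMDB data.
ROWS = [
    {"Primary Title": "Title A", "Genres": "Drama", "Start Year": 2014},
    {"Primary Title": "Title B", "Genres": "Drama", "Start Year": 2014},
    {"Primary Title": "Title C", "Genres": "Drama", "Start Year": 2020},
    {"Primary Title": "Title D", "Genres": "Comedy", "Start Year": 2020},
]


def pivot_count(rows, row_group, column_group):
    """Count rows per (row_group value, column_group value) cell.

    Returns a nested dict: table[row_value][column_value] -> count.
    """
    cells = Counter((row[row_group], row[column_group]) for row in rows)
    table = {}
    for (row_value, column_value), count in cells.items():
        table.setdefault(row_value, {})[column_value] = count
    return table


table = pivot_count(ROWS, "Genres", "Start Year")

# One pivot row: Drama counts by year.
print(table["Drama"])
```

Adding a second row group, as in the next step, would simply mean keying each cell on one more column.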
That’s how the Pivot Mode works in Gigasheet. Interesting, isn’t it? Let’s dive in a bit deeper. Let’s add another row group: Average Rating.
So first we’re grouping our data by Genres and then by Average Rating.
Here’s what our Pivot Mode looks like:
Our Row Groups – Genres & Average Rating.
Our Value (which is calculated) – Primary Title.
Our Column Groups – Start Year.
Upon grouping the data using the pivot mode, here’s what it looks like:
As you can see, 18,205 Drama movies or TV shows were launched in 2014, of which 85 had a 10-star IMDB rating, 37 had a 7-star rating, 48 had an 8-star rating, and so on.
That’s how you can really dive deep into the data. Love the Pivot Mode? We can’t wait for you to try it out. Sign up today!
Analyzing big CSVs has never been easier. With Gigasheet, you can not only load and access gigantic CSVs but also play around with them to dive deep into your data.
Already in love with Gigasheet?