Daniel Deng
2015-03-26 10:43:31 UTC
Hi,
I am interested in the data exploration idea for AcousticBrainz and have drafted some ideas and questions on how I would go about it. I would like to know if it seems like I have the right idea of what this project actually entails.
Thanks,
Daniel Deng
Summary:
Create an application that allows users to select music and generate various charts from a selection of filters, orderings, and groupings.
This project will consist of 3 main tasks.
1. Generate a database that combines relevant information from MusicBrainz and AcousticBrainz.
2. Create a database application that manages the "Library" of music that the user has selected, returns the data needed for the chart requested by the user, and allows for updates by some administrator.
3. Create a front-end that interacts with the database application.
A couple of questions I have:
How will this be integrated within the current MusicBrainz ecosystem? Will it be a desktop application like Picard or will it be web-based?
How is the AcousticBrainz data currently stored? Is it stored using some DBMS, or is it just a collection of JSON files?
Additional details if interested:
Database
The database will consist of three entities: tracks, artists, and releases. All of these will be associated with the corresponding MBID on MusicBrainz. The purpose of having an artist and release entity is to make it easier to select all the tracks associated with one of them. Each track will contain the "summary" information from the low-level data, such as average_loudness and key_strength, and all the categorical values from the high-level data (probabilities will be excluded).
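As a rough illustration, the three entities could be laid out as below. This is only a sketch using SQLite through Python's sqlite3 module. The table names, the name/title columns, and the single `mood` column standing in for the categorical high-level values are all assumptions made for the example, not a settled schema.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE artist (
    mbid TEXT PRIMARY KEY,              -- MusicBrainz artist MBID
    name TEXT NOT NULL
);
CREATE TABLE release (
    mbid TEXT PRIMARY KEY,              -- MusicBrainz release MBID
    title TEXT NOT NULL,
    artist_mbid TEXT REFERENCES artist(mbid)
);
CREATE TABLE track (
    mbid TEXT PRIMARY KEY,              -- MusicBrainz MBID for the track
    title TEXT NOT NULL,
    artist_mbid TEXT REFERENCES artist(mbid),
    release_mbid TEXT REFERENCES release(mbid),
    average_loudness REAL,              -- low-level summary value
    key_strength REAL,                  -- low-level summary value
    mood TEXT                           -- stand-in for a categorical high-level value
);
"""


def create_database(path=":memory:"):
    """Create the sketch schema and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def tracks_for_artist(conn, artist_mbid):
    """Return the MBIDs of all tracks linked to one artist."""
    rows = conn.execute(
        "SELECT mbid FROM track WHERE artist_mbid = ? ORDER BY mbid",
        (artist_mbid,),
    )
    return [row[0] for row in rows]
```

Keeping artist_mbid and release_mbid on each track is what makes "add everything by this artist" a single lookup.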
User Application
There will be three main steps to make a chart: creating the library of tracks to explore, processing the library, and choosing the desired chart.
First, the user will create the library by entering search strings for artists, releases, tracks, and tags. If a query is ambiguous, the user will be prompted to select from a list of possibilities. Alternatively, the user can just enter the MBID of the item they want. When an artist, release, or tag is entered, all the tracks corresponding to it are added to the library. Selecting the entire collection of tracks will also be possible.
Next, four processors are provided for the user: filtering, ordering, grouping, and attribute of interest. For filtering, the user can remove tracks that have specified attribute values; these conditions can be combined with AND or OR operators. For ordering, the user can specify an attribute to order the tracks by. For grouping, the user can specify an attribute that splits the remaining tracks into groups. For attribute of interest, the user can specify which attribute to place on the chart (this could just be a count of occurrences).
Finally, the user selects the desired chart; options could include a pie chart, bar graph, or histogram.
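To make the four processors concrete, here is a minimal in-memory sketch that treats the library as a list of dictionaries. The function names, the `(attribute, value)` condition format, and the choice of a plain count as the attribute of interest are all assumptions for illustration.

```python
from collections import Counter


def matches(track, conditions, combine="AND"):
    """Check a track against (attribute, value) conditions joined by AND or OR."""
    results = [track.get(attr) == value for attr, value in conditions]
    return all(results) if combine == "AND" else any(results)


def process(library, remove=(), combine="AND", order_by=None, group_by=None):
    """Apply filtering, ordering, and grouping, then count tracks per group."""
    # Filtering: drop every track that matches the removal conditions.
    kept = [t for t in library if not (remove and matches(t, remove, combine))]
    # Ordering: sort the remaining tracks by one attribute.
    if order_by is not None:
        kept.sort(key=lambda t: t[order_by])
    # Grouping plus attribute of interest (here simply a count of occurrences).
    if group_by is None:
        return {"all": len(kept)}
    return dict(Counter(t[group_by] for t in kept))
```

For example, removing tracks whose `mood` is "sad" and grouping by `key` gives a mapping from each key to a track count, which is the data a pie chart or bar graph would need.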
Database application
I don't think there is too much to say about this, but I may be wrong. Generating the SQL statements shouldn't be too complicated, as the user application is rather restrictive. Essentially, the first three processors (filtering, ordering, and grouping) correspond to WHERE, ORDER BY, and GROUP BY.
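A sketch of how that mapping might look, assuming a single `track` table and a fixed whitelist of attribute names (the table name, the column names, and the `build_query` helper are all hypothetical). Attribute names are checked against the whitelist because they cannot be passed as bound parameters, while the values to filter on are.

```python
# Hypothetical whitelist of attributes the user may filter, group, or order by.
ALLOWED_COLUMNS = {"average_loudness", "key_strength", "mood",
                   "artist_mbid", "release_mbid"}


def build_query(remove=(), combine="AND", group_by=None, order_by=None):
    """Translate the processors into (sql, params) for a parameterized query."""
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    names = [attr for attr, _ in remove]
    if group_by is not None:
        names.append(group_by)
    for name in names:
        if name not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown attribute: {name}")

    if group_by is None:
        # No grouping: list the matching tracks.
        select, valid_order = "mbid", ALLOWED_COLUMNS
    else:
        # Grouping: count tracks per group; only the group or the count can be ordered.
        select, valid_order = f"{group_by}, COUNT(*) AS n", {group_by, "n"}
    if order_by is not None and order_by not in valid_order:
        raise ValueError(f"cannot order by: {order_by}")

    sql, params = f"SELECT {select} FROM track", []
    if remove:
        # Filtering removes matching tracks, hence WHERE NOT (...).
        # Note: rows where a compared attribute is NULL are also excluded.
        clause = f" {combine} ".join(f"{attr} = ?" for attr, _ in remove)
        sql += f" WHERE NOT ({clause})"
        params = [value for _, value in remove]
    if group_by is not None:
        sql += f" GROUP BY {group_by}"
    if order_by is not None:
        sql += f" ORDER BY {order_by}"
    return sql, params
```

The returned pair can be handed directly to a DB-API cursor, e.g. `cursor.execute(sql, params)`.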