Discussion:
[mb-devel] Calling all music fiends who have MBID tagged music collections!
Robert Kaye
2014-10-20 21:01:57 UTC
Hi!

We're getting a new project off the ground and need your help!

MusicBrainz, along with MTG-UPF, is working on a new project, AcousticBrainz. The goal is to compute audio features [0] and make them freely available to anybody, for MIR research and for open source enthusiasts. We are generating audio features with the open source Essentia [1] feature extractor system, which anybody can use and contribute to.

We hope that this open collection of features with known feature extractors can be used by people to make lots of cool stuff. The AcousticBrainz data includes a lot of data that previously had to be licensed from expensive sources -- now this data will be open source! This data can also serve as the building blocks for anyone to build their own recommendation or music discovery system. Clearly this will take some time to get there, but this is the first step.

We're hoping to get up to 1,000,000 scanned audio tracks by next week! We're 1/4 of the way there and we could really use your help to accomplish our goal.

We need the help of anyone with audio files on their computers to help generate new data. You don't need to send us any audio, we just ask that you download our extractor and submission program.
The only requirement is that your files are tagged with MusicBrainz IDs (this is how we identify the music). If you haven't tagged your audio collection yet, take this opportunity to download MusicBrainz Picard [2] and identify and rename your music collection. The feature extractor runs at about 20x real-time, that is, about 10 seconds per 3-minute song -- sadly this is pretty slow, but it is doing a very detailed music analysis!
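
As a rough sketch of the arithmetic behind that figure (the function name and the 5,000-track collection are made up for illustration, and real throughput will vary by machine):

```python
def estimated_scan_seconds(audio_seconds: float, speedup: float = 20.0) -> float:
    """Wall-clock seconds needed to analyse audio at `speedup` times real-time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# One 3-minute song at about 20x real-time.
one_song = estimated_scan_seconds(3 * 60)
print(one_song)  # 9.0 seconds, i.e. "about 10 seconds"

# Hypothetical collection: 5,000 tracks of 3 minutes each, result in hours.
collection_hours = estimated_scan_seconds(5000 * 3 * 60) / 3600
print(round(collection_hours, 1))  # 12.5
```

So a collection of that size would be expected to take on the order of half a day on a machine that actually reaches 20x.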

To get started, go here: http://acousticbrainz.org/contribute

We have static builds for Linux and we're working to release solutions for Windows and Mac very soon.

Thanks!

[0] http://acousticbrainz.org/sample-data
[1] https://github.com/MTG/essentia
[2] http://picard.musicbrainz.org/

--

--ruaok

Robert Kaye -- ***@musicbrainz.org -- http://musicbrainz.org
Philipp Wolfer
2014-10-21 07:22:41 UTC
Hi Robert,
Post by Robert Kaye
Hi!
We're getting a new project off the ground and need your help!
MusicBrainz, along with MTG-UPF is working on a new project,
AcousticBrainz. The goal is to compute audio features [0] and make them
freely available for anybody to use for MIR research and to open source
enthusiasts. We are generating audio features with the open source essentia
[1] feature extractor system, which anybody can use and contribute to.
That sounds pretty cool and I will scan my music collection for sure. I am
looking forward to seeing some of the data being used in real-world
applications :)

How does AcousticBrainz currently deal with multiple submissions for the
same recording? I can see in the API data that the system stores the
characteristics of the analyzed files (bitrate, codec etc.). What happens
if I submit an analysis for the same track from a FLAC file and later from
a 128kbit MP3?

Phil
Alastair Porter
2014-10-21 07:59:50 UTC
At the moment we're just storing all data that people send us, even if that
means high quality and low quality audio.
We did some testing beforehand with different audio encodings and found
that the difference between lossless and lossy audio was not very
significant.
Nevertheless, we're still thinking about the best way to present the best
choice when people want to download it. Any discussion on this topic would
be welcome.
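
To make the discussion concrete, here is one possible selection rule, sketched under assumptions: prefer a lossless submission, then the highest bitrate. The `Submission` fields and the codec list are hypothetical and are not the actual AcousticBrainz schema or policy.

```python
from dataclasses import dataclass

# Assumption: an illustrative, incomplete list of lossless codecs.
LOSSLESS_CODECS = {"flac", "alac", "wav"}


@dataclass
class Submission:
    mbid: str          # recording MBID the analysis belongs to
    codec: str         # codec of the analysed file
    bitrate_kbps: int  # bitrate of the analysed file


def preferred(submissions):
    """Pick one submission per MBID: lossless first, then highest bitrate."""
    best = {}
    for sub in submissions:
        rank = (sub.codec.lower() in LOSSLESS_CODECS, sub.bitrate_kbps)
        if sub.mbid not in best or rank > best[sub.mbid][0]:
            best[sub.mbid] = (rank, sub)
    return {mbid: sub for mbid, (_, sub) in best.items()}


subs = [
    Submission("rec-1", "mp3", 128),
    Submission("rec-1", "flac", 900),
    Submission("rec-2", "mp3", 320),
]
chosen = preferred(subs)
print(chosen["rec-1"].codec)  # flac
print(chosen["rec-2"].codec)  # mp3
```

All submissions stay stored; a rule like this would only decide which one to hand out by default.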

Alastair
_______________________________________________
MusicBrainz-devel mailing list
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-devel
Frederic Da Vitoria
2014-10-21 07:27:15 UTC
Post by Robert Kaye
We have static builds for linux and we're working to release solutions for
Windows and Mac very soon.
Hello,

Is there any chance the Windows version will be ready before the deadline?
--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » -
http://www.april.org
Alastair Porter
2014-10-21 07:57:54 UTC
Hi Frederic,
We're hopefully going to finish the Windows client by tomorrow or Thursday.
There's definitely enough time for you to contribute, although we're also
happy for contributions to continue past our "deadline". The idea is to
make the dataset continually grow.
I'll let you know here when the client is ready.

Alastair
Frederic Da Vitoria
2014-10-21 08:55:53 UTC
Post by Alastair Porter
Hi Frederic,
We're hopefully going to finish the windows client by tomorrow or
Thursday. There's definitely enough time for you to contribute, although
we're also happy for contributions to continue past our "deadline". The
idea is to make the dataset continually grow.
I'll let you know here when the client is ready.
Alastair
OK. I'll try to download the VM this evening, just in case. IIRC, I already
have VirtualBox installed at home.

I guess the Windows version will be faster than running through a VM. I
hope the Linux version logs the files which have been collected, so that I
can start with the VM first and then move the handled files out of the
music folder before switching to the Windows version. Ah yes, I see that
"(If you're comfortable with Linux, you can log in with user ab, password
ab and then: tail -f submit.log". Well, I'm not comfortable with Linux, but
this http://www.computerhope.com/unix/ulogin.htm seems to suggest I should
open a console, then use the login command, then the tail command. Am I
correct? BTW, when I'm about to switch, how do I stop the Linux program?
Does it have a window which I can close properly, or is it some kind of
hidden process? Sorry, I have many questions, but you did not give us much
time before the deadline :-)
--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » -
http://www.april.org
Frederic Da Vitoria
2014-10-21 18:17:01 UTC
I have started the VM. I had a little issue with the folder name: my music
folder is named Music, and the VM already has a shared folder named Music.
VirtualBox wasn't smart enough to correct this automatically; all it did
was refuse to let me click on OK without any explanation. It took me a few
minutes to think of this and change the share name for my folder. Maybe I
could have removed the existing shared folder?

The work is slow, really slow. On my i5-3210M, it takes close to 30 seconds
to handle one track, and the software hasn't yet reached my classical music
tracks! I am going to have to stop it tomorrow morning. How should I do it?
Just stop the VM? If I restart it, will it automatically skip the tracks it
already analysed? Or should I fetch the list in submit.log and move those
files out of the way?
--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » -
http://www.april.org
Ian McEwen
2014-10-21 18:26:08 UTC
Post by Frederic Da Vitoria
I have started the VM. I had a little issue with the folder name: my music
folder is named Music, and the VM already has a shared folder named Music.
VirtualBox wasn't smart enough to correct this automatically, all it did
was refuse to let me click on OK without any explanation. It took me a few
minutes to think of this and change the share name for my folder. Maybe I
could have removed the existing shared folder?
The work is slow, really slow. On my i5-3210M, it takes close to 30 seconds
to handle one track, and the software hasn't yet reached my classical music
tracks! I am going to have to stop it tomorrow morning. How should I do it?
Just stop the VM? If I restart it; will it automatically skip the tracks it
already analysed? Or should I fetch the list in submit.log and move those
files out of the way?
It keeps track of what it's already done -- the tracking is by filename,
so it's not necessarily useful if the sqlite DB is moved outside of the
VM (though you could go in and fix it manually if you felt inspired to),
but it should be fine as far as things done within the VM itself. The
submit.log tracking is separate -- that's entirely human-oriented and
for your own use.
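
A minimal sketch of filename-based tracking in SQLite, to show why a changed path matters. The table name, schema, and function names here are assumptions for illustration and are not taken from the actual submission client.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the tracking database. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed (filename TEXT PRIMARY KEY)"
    )
    return conn


def already_done(conn, filename):
    """True if this exact filename was recorded as processed."""
    row = conn.execute(
        "SELECT 1 FROM processed WHERE filename = ?", (filename,)
    ).fetchone()
    return row is not None


def mark_done(conn, filename):
    """Record a filename as processed; repeated calls are harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO processed (filename) VALUES (?)", (filename,)
    )
    conn.commit()


def pending(conn, filenames):
    """Return the filenames that still need to be analysed."""
    return [f for f in filenames if not already_done(conn, f)]


conn = open_db()
mark_done(conn, "/music/a.flac")
print(pending(conn, ["/music/a.flac", "/music/b.flac"]))  # ['/music/b.flac']

# The same file seen under a different path looks new, because the
# lookup key is the filename and not the audio content.
print(pending(conn, ["/mnt/share/a.flac"]))  # ['/mnt/share/a.flac']
```

With a key like this, restarting inside the same VM skips finished files, while a moved database or renamed share would not match.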

I'll let Rob or someone else more familiar with the particular VM setup
address the question of stopping it (for running it directly, it's just
ctrl-c, but since this is running automatically in the background that
won't do anything).
Frederic Da Vitoria
2014-10-21 19:46:23 UTC
Thanks for the info.

From what I read in the VirtualBox documentation, the VM should pause
automatically when I put my laptop to sleep and resume properly when I
wake my machine. This means that I shouldn't have any more issues until I
try to switch to the Windows version.
--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » -
http://www.april.org
Frederic Da Vitoria
2014-10-24 08:25:48 UTC
I believe all my files have been analyzed but I am not quite sure. Here is
my problem:
Tail is still running. Most of the files scroll too fast to read, I am
guessing the extractor already handled them, but it stops on some of them,
and after a few dozen seconds issues an error. Sorry, my laptop is not
currently beside me, so that I can't tell the precise error, but I remember
there is "alloc" in it, maybe "bad_alloc".

I checked a few of those files and it seems they are all large files (at
least some of those I saw fail were around 20 minutes long). This evening,
during my 3-hour train trip, I'll move the log file to Windows (as I said
above, I don't know enough about Linux to do it inside the VM), check where
the errors are, check the sizes for each. I'll report back tomorrow.
--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » -
http://www.april.org
Alastair Porter
2014-10-24 10:16:25 UTC
Hi Frederic,
Thanks for all your work contributing to this. I'll send you an email
off-list to see if we can work out where your errors are happening (if this
is a bug that we should fix then we should try and work it out)

Alastair
Chad Wilson
2014-10-25 18:06:48 UTC
Post by Frederic Da Vitoria
I believe all my files have been analyzed but I am not quite sure.
Tail is still running. Most of the files scroll too fast to read, I am
guessing the extractor already handled them, but it stops on some of
them, and after a few dozens seconds issues an error. Sorry, my laptop
is not currently beside me, so that I can't tell the precise error,
but I remember there is "alloc" in it, maybe "bad_alloc".
I checked a few of those files and it seems they are all large files
(at least some of those I saw fail were around 20 minutes long). This
evening, during my 3 hours train trip, I'll move the log file to
Windows (as I said above, I don't know enough about Linux to do it
inside the VM), check where the errors are, check the sizes for each.
I'll report back tomorrow.
For what it's worth, I had the same problem with a few hour+ long DJ
mixes and an hour long interview with an artist as part of a deluxe
album (all MP3, around 85MB so not uber quality).

512MB just wasn't enough memory allocated to the VM for it to cope with
these it seems. It'd get stuck for ages and eventually bad_alloc. I
increased the VM size in VirtualBox to allow 4GB and it worked fine.
Watching top while it was doing it indicated streaming_extractor_music
getting up to around 2-2.5GB resident memory at peak for these files.

Cheers
Chad / voiceinsideyou
Frederic Da Vitoria
2014-10-25 23:03:53 UTC
Post by Chad Wilson
For what it's worth, I had the same problem with a few hour+ long DJ
mixes and an hour long interview with an artist as part of a deluxe
album (all MP3, around 85MB so not uber quality).
512MB just wasn't enough memory allocated to the VM for it to cope with
these it seems. It'd get stuck for ages and eventually bad_alloc. I
increased the VM size in VirtualBox to allow 4GB and it worked fine.
Watching top while it was doing it indicated streaming_extractor_music
getting up to around 2-2.5GB resident memory at peak for these files.
Cheers
Chad / voiceinsideyou
Ah, so that's where the problem came from. From my tests, the limit seems
to be around 23 minutes. I'll try to increase the memory size and let the
extractor handle the few files remaining in my collection.
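
One way to organise that second pass, sketched under assumptions: set aside the long tracks and run them only after giving the VM more memory. The 23-minute cutoff is just the rough limit reported above for the 512MB VM, track durations are assumed to be known already, and the function name is made up.

```python
def split_by_duration(tracks, limit_minutes=23.0):
    """Split (name, duration_seconds) pairs into (short, long) name lists."""
    limit_seconds = limit_minutes * 60
    short, long_ = [], []
    for name, seconds in tracks:
        # Tracks at or under the limit go in the first pass.
        (short if seconds <= limit_seconds else long_).append(name)
    return short, long_


tracks = [
    ("song.flac", 200),
    ("symphony-movement.flac", 25 * 60),
    ("dj-mix.mp3", 70 * 60),
]
first_pass, needs_more_memory = split_by_duration(tracks)
print(first_pass)         # ['song.flac']
print(needs_more_memory)  # ['symphony-movement.flac', 'dj-mix.mp3']
```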
--
Frederic Da Vitoria
(davitof)

Membre de l'April - « promouvoir et défendre le logiciel libre » -
http://www.april.org
Paul Taylor
2014-10-21 08:00:19 UTC
Post by Frederic Da Vitoria
Hello,
Is there any chance the Windows version will be ready before the deadline?
And the OSX version? I would like to promote this initiative, but the 1GB
virtual machine is not a nice option for most. An ETA would be helpful!

Paul
Alastair Porter
2014-10-21 08:28:56 UTC
Permalink
We have an OS X version compiled but not tested. Once it's tested you can
use the same CLI submitter that we have for Linux, or wait a few days more
for the GUI.
I'll keep you up to date.

Alastair