Discussion:
[mb-devel] Ratings and throttling
Joe Martinez
2015-02-16 17:10:51 UTC
I am developing a music application that does the following:

1) Encourages users to rate their songs by prompting the user to rate a
song in his/her library while it's playing, if the user has not already
rated it.

2) Allows the user to choose his/her desired mix of ratings (i.e. how
frequently 5-star songs play, compared to 4-star songs, etc.), as well as
what frequency of unrated songs are chosen, when shuffling or creating auto
playlists.
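
The mix described in 2) can be sketched as a two-step weighted pick: choose a rating bucket according to the user's weights, then choose a song inside that bucket. This is only an illustration, not the application's actual code; the weights in DEFAULT_MIX, the name pick_song, and the shape of the library dictionary are all assumptions made up for the example.

```python
import random

# Hypothetical mix: relative weight of each rating bucket (None = unrated).
DEFAULT_MIX = {5: 40, 4: 30, 3: 15, 2: 5, 1: 0, None: 10}


def pick_song(library, mix=DEFAULT_MIX, rng=random):
    """Pick one song: choose a rating bucket by weight, then a song in it.

    `library` maps a song id to its rating (1-5), or None if unrated.
    """
    # Group the songs by rating.
    buckets = {}
    for song, rating in library.items():
        buckets.setdefault(rating, []).append(song)

    # Only buckets that contain songs and have a positive weight can be drawn.
    ratings = [r for r in buckets if mix.get(r, 0) > 0]
    if not ratings:
        raise ValueError("no playable songs for this mix")

    weights = [mix[r] for r in ratings]
    chosen = rng.choices(ratings, weights=weights, k=1)[0]
    return rng.choice(buckets[chosen])
```

Picking the bucket first keeps the frequency of, say, 5-star songs independent of how many 5-star songs the library happens to contain.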

I was planning on using ID3 Popularimeter tags to do this directly in the
MP3 files, but I was discussing my idea with Robert Kaye a while back, and
he recommended that I store the user ratings in MusicBrainz instead, so
that a user's ratings would be consistent across devices, and also so that
MusicBrainz could build up its aggregate rating data for the benefit of
other users.

So, I've been working on this, and #1 is working great.

The problem is that on #2, I am running into throttling issues. It
requires the application to query the user-rating of each song in the
user's library when the application starts, so that it will have that data
available to create the desired mix. I have been querying the ratings
using:

/ws/2/recording/<MBID>?inc=user-ratings

I do that query in a loop, which of course generates a lot of requests to
the server, and I quickly start getting throttling errors. If I were to
put a delay into the loop to do only 1 query per second, it would take
forever to get all the data.
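
To put a number on "forever", here is a rough sketch of the one-request-per-recording loop with a delay between requests. The host in WS_ROOT, the function names, and the injectable fetch/sleep/clock parameters are assumptions for illustration; the real application would pass in an authenticated HTTP call as fetch.

```python
import time

WS_ROOT = "https://musicbrainz.org"  # assumed host for the web service


def rating_url(mbid):
    """URL of the per-recording lookup quoted above."""
    return f"{WS_ROOT}/ws/2/recording/{mbid}?inc=user-ratings"


def fetch_all(mbids, fetch, min_interval=1.0,
              sleep=time.sleep, clock=time.monotonic):
    """Call fetch(url) once per MBID, at most once every min_interval seconds."""
    results = {}
    last = None
    for mbid in mbids:
        if last is not None:
            # Wait out whatever is left of the interval since the last request.
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        results[mbid] = fetch(rating_url(mbid))
    return results


def estimated_minutes(n_requests, min_interval=1.0):
    """Lower bound on the total time, ignoring network latency."""
    return n_requests * min_interval / 60
```

At one request per second, estimated_minutes(3800) comes out a little over an hour, so the delay alone makes this approach impractical at startup for a library of a few thousand recordings.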

So, I'm wondering if there's a better (more efficient) way to retrieve all
of a user's ratings that wouldn't cause throttling issues. Is there maybe
an endpoint that would allow me to grab all of the authenticated user's
ratings at one time (like the musicbrainz.org/user/<USERNAME>/ratings web
page, but in XML)? If not, any other ideas?

-Joe
Ulrich Klauer
2015-02-16 19:33:01 UTC
Post by Joe Martinez
So, I'm wondering if there's a better (more efficient) way to retrieve all
of a user's ratings that wouldn't cause throttling issues. Is there maybe
an endpoint that would allow me to grab all of the authenticated user's
ratings at one time (like the musicbrainz.org/user/<USERNAME>/ratings web
page, but in XML)? If not, any other ideas?
I don't think there is a better way at the moment, unfortunately. It's
one of the many deficiencies of the current web service.

There is even a feature request that is four years old:
http://tickets.musicbrainz.org/browse/MBS-1160 (still refers to ws/1)
So, the only way would be to implement that ticket.

Ulrich
Ian McEwen
2015-02-17 02:37:03 UTC
Post by Joe Martinez
1) Encourages users to rate their songs by prompting the user to rate a
song in his/her library while it's playing, if the user has not already
rated it.
2) Allows the user to choose his/her desired mix of ratings (i.e. how
frequently 5-star songs play, compared to 4-star songs, etc.), as well as
what frequency of unrated songs are chosen, when shuffling or creating auto
playlists.
I was planning on using ID3 Popularimeter tags to do this directly in the
MP3 files, but I was discussing my idea with Robert Kaye a while back, and
he recommended that I store the user ratings in MusicBrainz instead, so
that a user's ratings would be consistent across devices, and also so that
MusicBrainz could build up its aggregate rating data for the benefit of
other users.
So, I've been working on this, and #1 is working great.
The problem is that on #2, I am running into throttling issues. It
requires the application to query the user-rating of each song in the
user's library when the application starts, so that it will have that data
available to create the desired mix. I have been querying the ratings using:
/ws/2/recording/<MBID>?inc=user-ratings
I do that query in a loop, which of course generates a lot of requests to
the server, and I quickly start getting throttling errors. If I were to
put a delay into the loop to do only 1 query per second, it would take
forever to get all the data.
So, I'm wondering if there's a better (more efficient) way to retrieve all
of a user's ratings that wouldn't cause throttling issues. Is there maybe
an endpoint that would allow me to grab all of the authenticated user's
ratings at one time (like the musicbrainz.org/user/<USERNAME>/ratings web
page, but in XML)? If not, any other ideas?
The only thing I can think of offhand is that you can get recording
ratings via release requests and via recording browse requests, e.g. by
artist. So if you have release tags and artist tags you may be able to
get ratings for all recordings at once (or, in the case of browse
requests, up to 100 at once via the limit parameter).

I don't think there's a straightforward "get all ratings by X user", but
this might help you some, anyway.
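
As a rough illustration of the release-request idea, the sketch below builds a release lookup URL with recordings and user ratings included, and pulls the rating of every recording out of the XML. The host, the function names, and above all the SAMPLE document are assumptions: SAMPLE is a hand-written stand-in for the response shape, not captured server output, and the request itself would need to be authenticated for user ratings to be returned.

```python
import xml.etree.ElementTree as ET

# Namespace of the MusicBrainz XML metadata format.
NS = {"mb": "http://musicbrainz.org/ns/mmd-2.0#"}


def release_url(release_mbid):
    """One request per release instead of one per recording."""
    return ("https://musicbrainz.org/ws/2/release/"
            f"{release_mbid}?inc=recordings+user-ratings")


def ratings_from_release(xml_text):
    """Return {recording MBID: rating or None} for every track in the XML."""
    root = ET.fromstring(xml_text)
    ratings = {}
    for rec in root.iterfind(".//mb:track/mb:recording", NS):
        node = rec.find("mb:user-rating", NS)
        # A recording the user has not rated has no user-rating element.
        ratings[rec.get("id")] = float(node.text) if node is not None else None
    return ratings


# Assumed response shape, trimmed to the elements the parser looks at.
SAMPLE = """\
<metadata xmlns="http://musicbrainz.org/ns/mmd-2.0#">
  <release id="release-1">
    <medium-list count="1">
      <medium>
        <track-list count="2">
          <track><recording id="rec-1"><title>A</title>
            <user-rating>4</user-rating></recording></track>
          <track><recording id="rec-2"><title>B</title></recording></track>
        </track-list>
      </medium>
    </medium-list>
  </release>
</metadata>
"""
```

With the sample above, ratings_from_release(SAMPLE) maps rec-1 to 4.0 and rec-2 to None, so rated and unrated recordings both end up in the result.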
_______________________________________________
MusicBrainz-devel mailing list
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-devel
Joe Martinez
2015-02-17 04:34:01 UTC
Post by Ian McEwen
The only thing I can think of offhand is that you can get recording
ratings via release requests and via recording browse requests, e.g. by
artist. So if you have release tags and artist tags you may be able to
get ratings for all recordings at once (or, in the case of browse
requests, up to 100 at once via the limit parameter).
If I did it by release, then I'd need to do a separate query for each
release, right? I guess that would be faster than one request per
recording, but if I have 600 albums, then that would take about 10
minutes. Still a bit slow. And, if a release had more than 25 recordings,
then it would get truncated.

Doing a browse request by artist I think would have a similar problem. If
I have 300 artists, that's 300 requests, each of which may have several
100-recording pages, so that might take just as long.

And since in both of these cases, I'm getting back lots of data that I
don't care about (recordings that I don't own), it's probably not too nice
to the server.

Thanks for the info and ideas, though. I really hope somebody implements
the feature request that Ulrich linked to. It would make everything nice
and clean. It doesn't sound like too difficult a job. I'd possibly
volunteer myself if I knew Python :)

So, it looks like I might have to resort to using the user rating web page
and scraping the HTML (as the feature request mentions), and just hope that
the format doesn't change :(

-Joe
Ian McEwen
2015-02-17 07:43:13 UTC
Post by Joe Martinez
Post by Ian McEwen
The only thing I can think of offhand is that you can get recording
ratings via release requests and via recording browse requests, e.g. by
artist. So if you have release tags and artist tags you may be able to
get ratings for all recordings at once (or, in the case of browse
requests, up to 100 at once via the limit parameter).
If I did it by release, then I'd need to do a separate query for each
release, right? I guess that would be faster than one request per
recording, but if I have 600 albums, then that would take about 10
minutes. Still a bit slow. And, if a release had more than 25 recordings,
then it would get truncated.
I believe these would not be truncated, since it's not a listing of
pages, it's listed in the context of the tracks. So I believe you'd get
the whole release's worth of recordings. There's also browsing
recordings by release, of course, which would certainly make everything
available, if possibly by multiple pages.
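
Browsing recordings by release in pages could look roughly like the sketch below. It only generates the page URLs; total is assumed to come from the count reported in the first page of results, and the host and the function name are made up for the example.

```python
from urllib.parse import urlencode


def browse_page_urls(release_mbid, total, limit=100):
    """URLs for every page of a recording browse request for one release.

    `total` is the number of recordings the server reports for the release;
    `limit` is the page size (up to 100, per the discussion above).
    """
    urls = []
    for offset in range(0, total, limit):
        query = urlencode({
            "release": release_mbid,
            "inc": "user-ratings",
            "limit": limit,
            "offset": offset,
        })
        urls.append("https://musicbrainz.org/ws/2/recording?" + query)
    return urls
```

A release with 250 recordings would need three pages (offsets 0, 100 and 200), while a typical album fits in a single request.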

Other concerns definitely still apply. Just figured I'd mention this,
since it could at least be somewhat fewer requests (in an ideal case,
a 100-fold reduction).
Post by Joe Martinez
Doing a browse request by artist I think would have a similar problem. If
I have 300 artists, that's 300 requests, each of which may have several
100-recording pages, so that might take just as long.
And since in both of these cases, I'm getting back lots of data that I
don't care about (recordings that I don't own), it's probably not too nice
to the server.
Certainly. The release one I figured was possibly more straightforward,
since it's a frequent case that someone has whole releases. Whole-artist
requests would probably not be except in very exceptional cases.
Post by Joe Martinez
Thanks for the info and ideas, though. I really hope somebody implements
the feature request that Ulrich linked to. It would make everything nice
and clean. It doesn't sound like too difficult a job. I'd possibly
volunteer myself if I knew Python :)
Perl, in point of fact, not Python :) but I suspect that's no better for
your case.

Incidentally, I notice that it's undocumented, but looking at the code,
a more bandwidth-intensive rating lookup does exist at
/ws/2/rating?entity=<type>&id=<mbid> -- perhaps that could be relatively
rapidly extended to support what's desired here. (I notice it returns
nothing when there's no rating, just <metadata/>, and 1-100 values when
there is one, in the light of previous observations of inconsistency).

Alternatively to the way that ticket presents it, perhaps allowing
something like browse requests would be more reasonable --
/ws/2/recording?rating_editor=whoever&limit=100 or so (and, presumably,
tag_editor or such). That'd keep it more consistent with the typical
case where recording (or whatever else) information is also desired.
Paul Taylor
2015-02-17 09:13:35 UTC
Post by Joe Martinez
So, it looks like I might have to resort to using the user rating web
page and scraping the HTML (as the feature request mentions), and just
hope that the format doesn't change :(
You would be better off spending your time agreeing on and implementing a
solution in MusicBrainz that everyone can use rather than trying to
circumvent the web service.
Robert Kaye
2015-02-17 09:59:08 UTC
Post by Paul Taylor
You would be better off spending your time agreeing on and implementing a
solution in MusicBrainz that everyone can use rather than trying to
circumvent the web service.
Says the person who built himself an entire web-service to avoid the MusicBrainz web service. :)

--

--ruaok

Robert Kaye -- ***@musicbrainz.org -- http://musicbrainz.org
Paul Taylor
2015-02-17 10:36:39 UTC
Post by Robert Kaye
Post by Paul Taylor
You would be better off spending your time agreeing on and implementing a
solution in MusicBrainz that everyone can use rather than trying to
circumvent the web service.
Says the person who built himself an entire web-service to avoid the MusicBrainz web service. :)
--
--ruaok
heh,

But to be clear, this was mainly because I had to create a web service for
Discogs anyway, since they made changes that made their web service no
longer that useful to me. I had also discussed things like a higher
request rate for paying applications with MusicBrainz and was not able to
come to any solution.

Paul
Joe Martinez
2015-02-20 18:01:24 UTC
Ian,

Thanks for the suggestion. I went ahead and implemented it using release
requests, and it is working quite well. It still takes maybe 15-20 minutes
to get through all of my collection (about 3,800 songs), and it gets slower
as time goes on, as the randomness hits the larger releases early in the
process, and there are mostly singletons left at the end. But, I can start
using the data pretty much right away, and only play songs that it has
cached the ratings for. It still would be great to just query all of my
ratings directly, but this will do for now.
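
The "start playing from whatever is cached" approach can be sketched as below. Fetching the releases that cover the most owned recordings first is a variation added here for illustration (the post describes a random order); the names fetch_order and RatingCache and the shape of the library dictionary are assumptions, not the application's actual code.

```python
def fetch_order(library):
    """Order release MBIDs so the biggest releases are requested first.

    `library` maps a release MBID to the set of owned recording MBIDs on it.
    """
    return sorted(library, key=lambda rel: len(library[rel]), reverse=True)


class RatingCache:
    """Ratings collected so far; grows as release responses arrive."""

    def __init__(self):
        self.ratings = {}

    def add_release(self, ratings):
        """Merge the {recording MBID: rating or None} result of one request."""
        self.ratings.update(ratings)

    def playable(self, owned):
        """Owned recordings whose rating (or lack of one) is already known."""
        return [rec for rec in owned if rec in self.ratings]
```

Because each release request fills in many recordings at once, the pool returned by playable grows quickly at the start, and the shuffle can begin well before the last single-track release has been fetched.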

-Joe
Paul Taylor
2015-02-17 09:25:34 UTC
Post by Joe Martinez
1) Encourages users to rate their songs by prompting the user to rate
a song in his/her library while it's playing, if the user has not
already rated it.
2) Allows the user to choose his/her desired mix of ratings (i.e. how
frequently 5-star songs play, compared to 4-star songs, etc.), as well
as what frequency of unrated songs are chosen, when shuffling or
creating auto playlists.
I was planning on using ID3 Popularimeter tags to do this directly in
the MP3 files, but I was discussing my idea with Robert Kaye a while
back, and he recommended that I store the user ratings in MusicBrainz
instead, so that a user's ratings would be consistent across devices,
and also so that MusicBrainz could build up its aggregate rating data
for the benefit of other users.
Submitting ratings to MusicBrainz is certainly good for the benefit of
others, but I don't think relying totally on this is scalable, especially
with the web service as it is. I would recommend you also store the
ratings within ID3. If you are only ever modifying your ratings with
your own application, then it's easy enough to keep them in sync. You
could also add a Force Sync button to force an overwrite of the ID3 field
with whatever is in MusicBrainz.
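
A minimal sketch of that sync logic, with plain dictionaries standing in for the ratings stored in the files and the ratings held by MusicBrainz. Reading and writing the actual ID3 frames is left out, the function names are hypothetical, and leaving local-only ratings untouched on a forced sync is an assumption rather than something specified above.

```python
def pending_submissions(local, remote):
    """Ratings stored locally that MusicBrainz lacks or has differently."""
    return {mbid: r for mbid, r in local.items() if remote.get(mbid) != r}


def force_sync(local, remote):
    """Overwrite local ratings with the MusicBrainz ones (the Force Sync button).

    Returns the MBIDs whose local rating changed. Ratings that exist only
    locally are left untouched (an assumption of this sketch).
    """
    changed = [mbid for mbid, r in remote.items() if local.get(mbid) != r]
    local.update(remote)
    return changed
```

In normal use the application would write each new rating to the file and submit it to MusicBrainz at the same time, so pending_submissions should usually be empty and force_sync should report no changes.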


Paul