Discussion:
[mb-devel] Create a unified browse and search interface (GSoC)
Timo Richter
2015-03-23 17:11:49 UTC
Dear Nikki,
dear Michael,
dear developers,

my name is Timo, and I have been studying computer science in Germany and
Portugal since 2010. I have developed programs using JavaScript, Python,
SQL and Java, and once using Lucene.
I have some ideas for a unified search interface. For the frontend I would
follow Nikki's draft.[1] But besides artists, it should also be able to find
a specific entry in the database, such as a release or a medium. First, it
is important to analyse what users search for and to optimise the search
function for that. Do they enter only the artist and work name? Would anyone
include the label or the year to filter the results? Numbers in the query
that are clearly years will be interpreted as such and used to filter the
results. I would also like to create an inverted index for the search,
pointing from each word in each artist name and album name to a specific
finding. A finding can be, for example, an artist, a release or a medium.
The search query will be split into words as well, and the findings that
match every word (the intersection of each word's findings) will be shown
at the top.
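
A minimal sketch of the inverted index idea in Python, only to make the
proposal concrete. The sample data, the function names and the rule that a
four-digit token starting with 19 or 20 counts as a year are all assumptions
made for this illustration; none of this is taken from the existing search
server.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample data: (finding id, entity type, name, year or None).
FINDINGS = [
    ("a1", "artist", "Daft Punk", None),
    ("r1", "release", "Random Access Memories", 2013),
    ("r2", "release", "Daft Punk Discovery", 2001),
    ("r3", "release", "Punk in Drublic", 1994),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"\w+", text.lower())


def build_index(findings):
    """Map every word of every name to the ids of the findings using it."""
    index = defaultdict(set)
    for finding_id, _type, name, _year in findings:
        for word in tokenize(name):
            index[word].add(finding_id)
    return index


def search(query, index, findings):
    """Return finding ids, those matching the most query words first."""
    years = {fid: year for fid, _type, _name, year in findings}
    words, year_filter = [], None
    for token in tokenize(query):
        # Assumed rule for this sketch: 1900-2099 is a year, not a word.
        if re.fullmatch(r"(19|20)\d\d", token):
            year_filter = int(token)
        else:
            words.append(token)

    if words:
        hits = Counter()
        for word in set(words):
            for fid in index.get(word, ()):
                hits[fid] += 1
    else:
        # Only a year was given: every finding is a candidate.
        hits = Counter({fid: 0 for fid in years})

    # Findings that match all words (the intersection) sort to the top.
    ranked = sorted(hits, key=lambda fid: (-hits[fid], fid))
    if year_filter is not None:
        ranked = [fid for fid in ranked if years[fid] == year_filter]
    return ranked
```

With this sample data, a query for "daft punk" ranks the two findings that
contain both words ahead of the one that contains only "punk", and adding
"2001" to the query narrows the result to the release from that year.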
Recording search queries will be helpful for later optimisations, such as
automatic correction of typing errors.[2]
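
As a rough illustration of how recorded queries could drive typo
correction, here is a small Python sketch. It assumes that queries entered
often enough are spelled correctly and suggests the closest one using the
standard library's difflib; the query log, the thresholds and the function
name are hypothetical, and a real system would likely need something more
robust.

```python
import difflib
from collections import Counter

# Hypothetical log of previously recorded search queries.
RECORDED_QUERIES = [
    "radiohead", "radiohead", "radiohead",
    "portishead", "portishead",
    "radiohed",
]


def suggest(query, recorded, min_count=2, cutoff=0.8):
    """Return a frequently seen query close to `query`, else `query` itself."""
    counts = Counter(q.lower() for q in recorded)
    # Assumption: a query seen at least `min_count` times is spelled right.
    popular = [q for q, n in counts.items() if n >= min_count]
    query = query.lower()
    if query in popular:
        return query
    matches = difflib.get_close_matches(query, popular, n=1, cutoff=cutoff)
    return matches[0] if matches else query
```

Here the misspelling "radiohed" would be mapped to "radiohead", while a
query with no close match in the log is returned unchanged.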
The working steps are:
1. Have the search index created regularly
2. Construct a search page frontend
3. Program the backend that parses the query and retrieves the results
4. Improve the quality of the results

Concerning the first step: where in the source code are the indexes
created “each 3 hours”? I could not find anything in the cron directory.


Best wishes,

Timo


[1] http://mbsandbox.org/~nikki/browseartists/ . 23/03/15
[2] http://hughewilliams.com/2012/03/19/query-rewriting-in-search-engines/ . 23/03/15
Michael Wiencek
2015-03-23 18:42:29 UTC
Hi Timo,

I think the improvements you propose to the search server/indexer
(which are hosted at [1] and [2]) would be way out of scope for this
project, since just creating the interface itself would be a huge
task. Note also that there's already another project on our GSoC ideas
page for finishing last year's search code improvements, so it
wouldn't make sense to start any big projects on the existing code.

The primary thing a student would work on for the unified browse and
search is the UI itself, which'll require knowledge of JavaScript
(recently we started using React.js for interface components) plus our
web service and its search syntax. Backend changes will need some
understanding of Perl, though it wouldn't be a huge part of the
project.

Michael

[1] https://bitbucket.org/metabrainz/search-server
[2] https://github.com/metabrainz/search-indexer
_______________________________________________
MusicBrainz-devel mailing list
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-devel
Timo Richter
2015-03-24 00:08:50 UTC
Hi Michael,

I was worried that the UI alone would not be enough. So I can just expand
the artist search to find other elements as well. And the syntax for the
search stays the same? Then I would use JavaScript on the client, maybe
React.js, and transform the options from a checklist, together with a
search word, into the query.
As a next step, the UI could take into account mobile devices and devices
without JavaScript. Is it likely that mobile users use the MusicBrainz
search?
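
To make the checklist idea concrete, here is a small sketch of turning the
ticked options plus a search word into web service requests. It is written
in Python only for brevity; the same logic would live in JavaScript on the
client. The base URL, the `query`, `limit` and `fmt` parameters and the
`country` field name are assumptions that should be checked against the web
service documentation, and the function names are made up for this example.

```python
from urllib.parse import urlencode

# Assumed root of the web service; verify against the documentation.
BASE_URL = "https://musicbrainz.org/ws/2"


def lucene_phrase(text):
    """Quote text as a Lucene phrase, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_urls(search_word, entity_types, filters=None, limit=10):
    """Build one search URL per ticked entity type.

    `entity_types` are the ticked checkboxes (e.g. "artist", "release");
    `filters` maps field names to values and is ANDed onto the search word.
    """
    clauses = [lucene_phrase(search_word)]
    for field, value in sorted((filters or {}).items()):
        clauses.append(f"{field}:{lucene_phrase(value)}")
    query = " AND ".join(clauses)
    params = urlencode({"query": query, "limit": limit, "fmt": "json"})
    return {entity: f"{BASE_URL}/{entity}?{params}" for entity in entity_types}
```

Calling `build_search_urls("Nirvana", ["artist", "release"], {"country":
"US"})` yields one URL per entity type, so the interface could query each
ticked type and merge the answers into a single result list.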


Timo