Overview
“March Madness is a phenomenon that grips the national sports psyche from the second week of March through the first week of April. March Madness is the ... NCAA Men's and Women's Basketball tournaments that determine the national champions of college basketball”
Ref: https://entertainment.howstuffworks.com/march-madness.htm
Through a series of four contests, you will build a tool that gathers information on teams and players by analyzing textual material from top sports writers and commentators. The material could include game reports, columns, Twitter feeds and blogs. From these materials, the tool should help categorize players by skill, temperament, role or position and similar attributes, using the capabilities of IBM Watson.
These contests build on each other, so it is worth making each solution generic enough to be reused in the next contest.
Description
In this first challenge, you are looking at teams with regard to bracket predictions.
The analysis of a document can change depending on which phrases are treated as important in its context. The example provided with the IBM Watson Natural Language Understanding (NLU) service illustrates this with a snippet from Martin Luther King's speech: with 'the American dream' as the target phrase, the sentiment of the document and certain keywords in the document change.
The task in this contest is to identify some words and phrases that are important in the context of March Madness and Basketball to enhance and enrich the information obtained from a document.
Consider this report:
If the whole document is fed into the NLU system as is, the impact of the individual analysis of the four teams considered is lost. However, if the reports on Syracuse and St Bonaventure are fed into the NLU system separately, a distinct sentiment is expressed about each of those teams. Further, if phrases such as “uphill climb” and “nervous start” are targeted, the negative sentiment of the writer about Syracuse is magnified. Similarly, the phrase “good shape” takes on a positive sentiment when it is targeted, indicating that the commentator favours St Bonaventure.
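The targeting idea above can be sketched as a small helper that builds the request body for one team's report. This is a minimal sketch, not a definitive implementation: the function name `build_targeted_request` and the sample sentence are hypothetical, the sentence is a stand-in and not the report referred to above, and the body layout (`text`, `features.sentiment.targets`, `features.keywords`) is assumed to follow the JSON accepted by the NLU analyze endpoint. No network call is made here.

```python
def build_targeted_request(text, targets):
    """Build a JSON-serializable body for an NLU analyze call (assumed layout).

    Only target phrases that literally occur in the text are kept, on the
    assumption that the service cannot score a phrase it cannot find.
    """
    lowered = text.lower()
    present = [t for t in targets if t.lower() in lowered]

    sentiment = {"document": True}
    if present:
        sentiment["targets"] = present

    return {
        "text": text,
        "features": {
            "sentiment": sentiment,
            # Ask for keyword-level sentiment as well.
            "keywords": {"sentiment": True},
        },
    }


# Hypothetical one-line report, used only to exercise the helper.
sample = "Syracuse faces an uphill climb after a nervous start."
body = build_targeted_request(
    sample, ["uphill climb", "nervous start", "good shape"]
)
print(body["features"]["sentiment"])
```

Sending one such body per team, instead of one body for the whole document, is what keeps the per-team sentiment from being averaged away.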
This is a simple example to show how minor tweaking of one NLU feature could change the analysis. Could you exploit the many features of NLU to create a richer model? Could you make this a more relevant model in the context of March Madness and College Basketball?
Select four colleges - they could be Bubble/fringe teams, tipped for the top 16, or even for the championship. The top 68 teams were announced on Selection Sunday, March 11.
Take 10 to 15 documents per college and build your model using the features of the NLU service: sentiment, categories, concepts, entities and relations. Select some questions that will be of interest to a basketball analyst. Examples:
- Which teams were tipped to make the top 68 but did not? What news or comments were associated with that?
- Which teams were tipped for the top 16 seeds? Who made it and who did not - what were their strengths and weaknesses?
- Which report sites or which commentators are most reliable?
You could look at sites like NCAA, Yahoo Sports, ESPN, Sporting News etc. Note that sports writing and journalism have their own jargon, which is full of hyperbole and borrows the language of war and theatre to convey the intensity of the games.
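One way to combine the features named above across the 10 to 15 documents per college is to request all of them for each document and then aggregate per team. The following is a sketch under stated assumptions: `ALL_FEATURES` and `summarize_team` are hypothetical names, and the response fields read here (`sentiment.document.score`, and `text` and `count` on each entity) are assumed to match the shape of an NLU analyze response. The sample responses in the usage lines are made up for illustration.

```python
from statistics import mean

# Feature block requesting the five features listed in the brief
# (assumed to be valid keys of the analyze request's "features" object).
ALL_FEATURES = {
    "sentiment": {},
    "categories": {},
    "concepts": {},
    "entities": {"sentiment": True},
    "relations": {},
}


def summarize_team(responses):
    """Aggregate a list of response-shaped dicts for one team.

    Returns the mean document sentiment score and the total mention
    count per entity text.
    """
    scores = [
        r["sentiment"]["document"]["score"]
        for r in responses
        if "sentiment" in r
    ]
    entity_counts = {}
    for r in responses:
        for ent in r.get("entities", []):
            name = ent["text"]
            entity_counts[name] = entity_counts.get(name, 0) + ent.get("count", 1)

    return {
        "mean_sentiment": mean(scores) if scores else None,
        "entity_counts": entity_counts,
    }


# Made-up responses, shaped like the assumed service output.
fake = [
    {"sentiment": {"document": {"score": 0.5}},
     "entities": [{"text": "Syracuse", "count": 2}]},
    {"sentiment": {"document": {"score": -0.25}},
     "entities": [{"text": "Syracuse", "count": 1}]},
]
print(summarize_team(fake))
```

A per-team summary like this gives a baseline from the default model, against which the effect of targeted phrases or custom jargon handling can be compared.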
Requirements
- Please join the Topcoder Cognitive Community if you have not already, and get an IBM Cloud Account by using this link.
- Choose a few questions and show how your answers are a distinct improvement from the answers using the default model.
- The reviewer should be able to feed a different set of documents and verify the richness of your model.
- Provide a way for the reviewer to input data: URLs, text files or JSON.
- Specify the structure of the input data if it is text or JSON. If it is a URL, specify the sites from which URLs may be supplied.
- Specify the restrictions on your solution and which inputs the reviewers may supply to your queries.
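As an illustration of what a documented input structure might look like, the sketch below accepts either a bare URL, a block of plain text, or a JSON list of records, and normalizes all three into the same record shape. Everything here is an assumption made for the example: the JSON layout (a list of objects with `team` and `source` keys), the function names, and the choice of `url` versus `text` as output keys, which is meant to mirror the two ways of handing content to the NLU analyze call.

```python
import json


def to_nlu_field(source):
    """Return {"url": ...} for http(s) links, otherwise {"text": ...}."""
    if source.startswith(("http://", "https://")):
        return {"url": source}
    return {"text": source}


def normalize_documents(raw):
    """Turn reviewer input into a list of {"team", "url" or "text"} records.

    `raw` is either a JSON list of {"team": ..., "source": ...} objects
    (hypothetical format) or a single URL / plain-text document.
    """
    raw = raw.strip()
    try:
        parsed = json.loads(raw)
    except ValueError:
        parsed = None

    # Anything that is not a JSON list is treated as one untagged document.
    if not isinstance(parsed, list):
        parsed = [{"team": None, "source": raw}]

    records = []
    for item in parsed:
        record = {"team": item.get("team")}
        record.update(to_nlu_field(item["source"]))
        records.append(record)
    return records


print(normalize_documents('[{"team": "Syracuse", "source": "An uphill climb."}]'))
print(normalize_documents("https://example.com/report"))
```

A real submission would also need to validate each record and reject URLs from sites outside the documented list; that is left out to keep the sketch short.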
Final Submission Guidelines
- Deploy your application to your own IBM Cloud instance.
- Upload a .zip containing your source code and a text file called ibm-cloud-deployment.txt. This .txt file should contain the URL of your deployed application for testing.
- You can use any programming language to build the application, as long as it is supported by IBM Cloud, has an API, provides a UI, and meets the spec criteria.
- Provide detailed instructions on deploying and testing it locally.
Review Guidelines
- Richness of Model
  a. Does the model utilize and exploit NLU features?
  b. What is the quality and complexity of the queries addressed by the model?
  c. Richness of model will not score any points if there is no implementation. However, the implementation could be for a section of the model.
- Implementation
  a. IBM Discovery features used to demonstrate the solution
  b. Design and code quality.
- Documentation
  a. A document explaining your approach and how you have enhanced the default scores given by IBM Watson. Did anything not work the way you expected? How are you enhancing the output?
  b. A demo video of your solution.
- Ease of Use
  a. How easy is it to set up and test the solution?
  b. User Interface: a functional interface should be sufficient to get a pass score.
- Performance on new/unknown data
  a. How well does the solution perform against new data?