SC4021 Group Assignment

1 Objective

In this group assignment, you will build an opinion search engine. Given a specific topic of your choice (e.g., cryptocurrencies), your system should enable users to find relevant opinions about any instance of such topic (e.g., bitcoin) and perform sentiment analysis on the results (e.g., opinions about bitcoin are 70% positive and 30% negative).

Once you have chosen a topic, make sure a) you can find enough data about it (e.g., some topics may be so niche that you can only find a few hundred records about them) and b) the opinions you get are balanced (e.g., if the topic you chose only has negative opinions associated with it, it is probably not a good topic). For ideas about interesting topics, you can check our project page at https://sentic.net/projects.

You are pretty much free to use anything you want in terms of available tools/libraries. However, your system cannot be just a mashup of existing services. Your final score will depend not only on how you developed your system but also on its novelty and your creativity: in other words, to get a high score you not only need to implement a system that works, but also a system that is useful and user-friendly.

2 Deadline and Grouping

The assignment constitutes 30% of your total grade for the course. Assignments are to be submitted through Blackboard (email submissions will not be considered) by 12th April at 11:59pm SGT. A 5% penalty applies for every rounded-off day after the deadline. Only the first submission counts (Blackboard allows multiple submissions only in case of system errors or disconnections).
The assignment will be performed in groups of 6 or 5 people. Members will be randomly assigned by the system for fairness. The main tasks of the assignment are crawling (20 points), indexing (40 points) and classification (40 points). If you like, you can split your group into up to three subgroups taking care of each of these tasks and specify who did what in your final report so that each member will be graded accordingly. If this information is not specified, a single grade will be given to the whole project and this will be shared among all members of the group (recommended option).

Some overlap between projects from different groups is allowed but beware that, if we find out that a project has more than 30% overlap with another project from this year or past years, your group will be disqualified (and get zero points as final grade for the assignment). Therefore, it is OK to share the general idea of your assignment with other groups but not the implementation details.

3 Tasks and Questions

3.1 Crawling (20 points)

Crawl text data from any sources which you are interested in and permitted to access, e.g., X API or Reddit API. The crawled corpus should contain at least 10,000 records and at least 100,000 words. It is OK to use available datasets for training (e.g., popular sentiment benchmarks), but you still need to at least crawl and label data for testing. For your own evaluation dataset, use the same format as the above-mentioned sentiment benchmarks (using a different format will result in demerit points) and name it “eval.xls”.
Also, make sure your dataset does not contain duplicates and try your best to make it balanced (e.g., equal number of positive and negative entries). Before crawling any data, carefully consider the questions in this material, e.g., check whether the data contain enough details to answer the questions. You can use any third-party libraries for the crawling process, e.g.:

- Jsoup: https://jsoup.org
- Twitter4j: https://twitter4j.org
- Facebook marketing: https://developers.facebook.com/docs/marketing-apis
- Instagram: https://instagram.com/developer
- Amazon: https://github.com/ivangp5/amazon-crawler
- Tinder: https://gist.github.com/rtt/10403467
- TikTok: https://developers.tiktok.com

Question 1: Explain and provide the following:

- How you crawled the corpus (e.g., source, keywords, API, library) and stored it
- What kind of information users might like to retrieve from your crawled corpus (i.e., applications), with sample queries
- The numbers of records, words, and types (i.e., unique words) in the corpus

3.2 Indexing (40 points)

Indexing: You can do this from scratch or use some combination of available tools, e.g., Solr+Lucene+Jetty. Solr runs as a standalone full-text search server within a servlet container such as Jetty. Solr uses the Lucene search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language.
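For groups that index from scratch, the core data structure is an inverted index: a map from each term to the records that contain it. The following is only a minimal sketch under stated assumptions, not a substitute for Solr or Lucene: the function names (`tokenize`, `build_index`, `search`), the toy records, and the boolean-AND matching are all illustrative choices, and a real system would add ranking, stemming, and persistent storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of record ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of records containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy records, not real crawled data.
docs = {
    1: "Bitcoin is going to the moon",
    2: "I lost money on bitcoin again",
    3: "Ethereum fees are too high",
}
index = build_index(docs)
print(sorted(search(index, "bitcoin")))        # [1, 2]
print(sorted(search(index, "bitcoin money")))  # [2]
```

Because each lookup touches only the posting sets of the query terms, the cost of a query depends on how many records contain those terms rather than on the size of the whole corpus, which is the main reason inverted indexes are preferred over scanning every record.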
Useful documentation includes:

- Solr project: https://solr.apache.org
- Solr wiki: https://wiki.apache.org/solr/FrontPage
- Lucene tutorial: https://lucene.apache.org/core/quickstart.html
- Solr with Jetty: https://wiki.apache.org/solr/SolrJetty
- Jetty tutorial: https://jetty.org

You can also choose other open inverted-index text search engine projects, e.g., Sphinx, Nutch, and Lemur. However, you should NOT simply adopt SQL-based solutions for text search (for example, you CANNOT implement text search simply using Microsoft SQL Server or MySQL).

Querying: You need to provide a simple but friendly user interface (UI) for querying. It could be either a web-based or mobile app based UI. You could use JSP in Java or Django in Python to develop your UI website. Since Solr provides REST-like APIs to access indexes, one extra JSON or RESTful library should be enough. Otherwise, you may use any third-party library. The UI should be kept simple. A sophisticated UI is neither necessary nor encouraged (as it is not the focus of this course). Detailed information besides text is allowed to be shown for the query results, e.g., product images on Amazon, ratings on Amazon, and pictures on Instagram. The details should be designed to solve specific problems.

Question 2: Perform the following tasks:

- Build a simple UI (you can design one from scratch or you can tap on an existing one, e.g., Solr UI) to allow users to access your system in a simple way
- Write five queries, get their results, and measure the speed of the querying

Question 3: Explore some innovations for enhancing the indexing and ranking.
Explain why they are important to solve specific problems, illustrated with examples. You can list anything that has helped improve your system from the first version to the last one, plus queries that did not work earlier but now work because of the improvements you made. Possible innovations include (but are not limited to) the following:

- Timeline search (e.g., allow users to search within specific time windows)
- Geo-spatial search (e.g., use map information to refine query results and improve visualization)
- Enhanced search (e.g., add histograms, pie charts, word clouds, etc.)
- Interactive search (e.g., refine search results based on users’ relevance feedback)
- Multimodal search (e.g., implement image or video retrieval)
- Multilingual search (e.g., enable your system to retrieve data in multiple languages)
- Multifaceted search (e.g., visualize information according to different categories)

3.3 Classification (40 points)

Even though most often defined as a binary classification problem, sentiment analysis is actually a complex task, or suitcase research problem, as it requires tackling many other subtasks. Choose at least two subtasks to perform information extraction on your crawled data. Unless you are sure that your data do not contain any neutral content, you should always include at least subjectivity detection and polarity detection. Specifically, you should first categorize your data as neutral versus opinionated and then classify the resulting opinionated data as positive versus negative.
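The two-stage flow described above (subjectivity detection first, polarity detection second) can be sketched as follows. This is a toy illustration only: the word lists `POSITIVE` and `NEGATIVE` are a hypothetical lexicon invented for the example, and both stages are simple word-matching rules standing in for whatever knowledge-based, rule-based, or learned classifiers a group actually chooses.

```python
import re

# Hypothetical toy lexicon; a real system would use a proper resource or a trained model.
POSITIVE = {"good", "great", "love", "bullish"}
NEGATIVE = {"bad", "awful", "hate", "scam"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def is_opinionated(tokens):
    """Stage 1 (subjectivity detection): is any sentiment-bearing word present?"""
    return any(t in POSITIVE or t in NEGATIVE for t in tokens)


def polarity(tokens):
    """Stage 2 (polarity detection): compare positive and negative word counts.

    Ties are labeled positive, which is an arbitrary choice made for this sketch.
    """
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score >= 0 else "negative"


def classify(text):
    """Run stage 1, and only pass opinionated records on to stage 2."""
    tokens = tokenize(text)
    if not is_opinionated(tokens):
        return "neutral"
    return polarity(tokens)


print(classify("I love bitcoin, great project"))  # positive
print(classify("Bitcoin is a scam"))              # negative
print(classify("Bitcoin was created in 2009"))    # neutral
```

Keeping the two stages as separate functions makes it possible to evaluate and replace each one independently, for example swapping the subjectivity rule for a trained model while leaving the polarity step unchanged.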
Different classification approaches can be applied, including:

- knowledge based, e.g., SenticNet
- rule based, e.g., linguistic patterns
- machine learning based, e.g., deep neural networks
- hybrid (a combination of any of the above)

You can tap into any resource or toolkit you like, as long as you motivate your choices and you are able to critically analyze the obtained results. Some options include:

- Weka: https://cs.waikato.ac.nz/ml/weka
- Hadoop: https://hadoop.apache.org
- Pylearn2: https://pylearn2.readthedocs.io/en/latest
- SciKit: https://scikit-learn.org
- NLTK: https://nltk.org
- Theano: https://github.com/Theano
- Keras: https://github.com/fchollet/keras
- Tensorflow: https://github.com/tensorflow/tensorflow
- PyTorch: https://pytorch.org
- Huggingface: https://huggingface.co/
- AllenNLP: https://github.com/allenai/allennlp

Question 4: Perform the following tasks:

- Motivate the choice of your classification approach in relation to the state of the art
- Discuss whether you had to preprocess data (e.g., microtext normalization) and why
- Build an evaluation dataset by manually labeling at least 1,000 records with an inter-annotator agreement of at least 80% (it is strongly recommended to have 3 annotators, but 2 is OK)
- Provide evaluation metrics such as precision, recall, and F-measure on such dataset
- Perform a random accuracy test on the rest of the data and discuss results
- Discuss performance metrics, e.g., records classified per second, and scalability of the system

Question 5: Explore some innovations for enhancing classification. If you introduce more than one, perform an ablation study to illustrate the contribution of each innovation.
For example, if you perform word sense disambiguation (WSD) and named entity recognition (NER) to improve sentiment analysis, show the increase in accuracy when adding only WSD, then the increase in accuracy when adding only NER, and the increase in accuracy when adding both WSD and NER to your system. Explain why they are important to solve specific problems, illustrated with examples. Possible innovations include (but are not limited to) the following:

- Enhanced classification (add another sentiment analysis subtask, e.g., sarcasm detection)
- Fine-grained classification (e.g., perform aspect-based sentiment analysis)
- Hybrid classification (e.g., apply both symbolic and subsymbolic AI)
- Cognitive classification (e.g., use brain-inspired algorithms)
- Multitask classification (e.g., perform two sentiment analysis tasks jointly)
- Ensemble classification (e.g., use stacked ensemble)

4 Submission

Choose one person of the group to be in charge of submitting the final assignment report. Submission has to be done through Blackboard (do not email your report). The results will not be presented in person so make sure your report is clear and self-contained. However, you do have the chance to present your work through a short YouTube video. In fact, if you search SC4021 on YouTube, you can see some examples from past years. You can take inspiration from them but doing a project that is more than 30% similar to past (and present) ones will result in annulment of your assignment.

The submission shall consist of one single PDF file named after your group number, e.g., if you are group 10, your file should be titled simply “10.pdf”.
Failing to name the file correctly or sending it in the wrong format, e.g., zip or MS Word, will result in demerit points. Do add some pictures to your report to make it clearer and easier to read. There is no page limit and no specific formatting is required. The file shall contain the following five key items:

- The names of the group members + matriculation number on the first page
- Your answers to all the above questions
- A YouTube link to a video presentation of up to 5 minutes: in the video, introduce your group members and their roles, explain the applications and the impact of your work and highlight, if any, the creative parts of your work (note that you do not need to include all the answers in the video presentation)
- A Dropbox (or similar, e.g., Google Drive or OneDrive) link to a compressed (e.g., zip) file with crawled text data, queries and their results, evaluation dataset, automatic classification results, and any other data for Questions 3 and 5
- A Dropbox (or similar, e.g., Google Drive or OneDrive) link to a compressed (e.g., zip) file with all your source codes and libraries, with a README file that explains how to compile and run the source codes
