Inverted Indexing

ITI220: Inverted Indexing Exercise

List Student Names in Group (and circle your name):

An inverted index, a common data structure in IR systems, is a list of all the index terms in a
collection with pointers to the documents in which each term occurs, as well as how often the term
appears in each document and where in the document it appears.

Specifically, an inverted index consists of:

• A collection of postings lists – one associated with each unique term in the collection.
• Each postings list consists of a number of individual postings.
• Each posting holds a unique document identifier (docno) and the frequency (count) of the term in
a given document.
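
The structure described in the bullets above can be sketched in Python as a dictionary that maps each term to its postings list. This is only an illustration: the two-document collection, the type aliases, and the helper postings_for are hypothetical and not part of the handout, and term positions are left out because the bullets describe only docno and count.

```python
from typing import Dict, List, Tuple

# A posting is a (docno, count) pair; a postings list is a list of postings.
Posting = Tuple[int, int]
InvertedIndex = Dict[str, List[Posting]]

# Hypothetical two-document collection (not from the handout):
#   doc 1: "to be or not to be"
#   doc 2: "to do"
index: InvertedIndex = {
    "be":  [(1, 2)],          # "be" occurs twice in doc 1
    "do":  [(2, 1)],
    "not": [(1, 1)],
    "or":  [(1, 1)],
    "to":  [(1, 2), (2, 1)],  # "to" occurs in both documents
}

def postings_for(index: InvertedIndex, term: str) -> List[Posting]:
    """Return the postings list for a term, or an empty list if the term is absent."""
    return index.get(term, [])

print(postings_for(index, "to"))
```

Looking up a term returns every document that contains it without scanning the documents themselves, which is the reason IR systems keep this structure.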

Let’s look at an example starting with the following famous lines from Shakespeare’s Merchant of Venice

Now, let’s treat each line (above) as if it were a “document.” The completed inverted index would look
something like this:

Let’s interpret one of the terms – if – in the inverted index:

The number directly after the term is its document frequency or df for short. The df specifies the
number of documents that contain this term. Since if appears in all four documents, its df is 4.
Although the df can be easily reconstructed by counting the number of postings, it is often explicitly stored
in the inverted index. The postings list contains a number of postings, each of which is a (docno, tf) tuple.
The docno is simply a unique identifier for the document (one through four, in this case). The tf, which
stands for term frequency, is the number of times the term appears in the document. The term if appears
once in every document. Typically, postings are sorted by ascending docno (as shown above).
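
To make df and tf concrete, here is a minimal sketch that builds postings lists from a small collection and derives the df by counting postings. The three sample documents, the tokenizer (lowercase letters and apostrophes only), and the function names are assumptions made for illustration; the handout does not prescribe any of them.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Assumed tokenizer: lowercase the text and keep runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(docs):
    """Map each term to a list of (docno, tf) postings, with docnos starting at 1."""
    index = defaultdict(list)
    for docno, text in enumerate(docs, start=1):
        # Counter gives the term frequency (tf) of each term in this document.
        for term, tf in Counter(tokenize(text)).items():
            index[term].append((docno, tf))
    return dict(index)

def document_frequency(index, term):
    """The df is the number of postings, i.e. the number of documents with the term."""
    return len(index.get(term, []))

# Hypothetical collection (not from the handout); each string is one document.
docs = ["the cat sat", "the cat saw the dog", "a dog barked"]
index = build_index(docs)

print(index["the"], document_frequency(index, "the"))
```

Because documents are processed in docno order, each postings list comes out sorted by ascending docno without an extra sorting step.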

Exercise #3 – Instructions

1. Get in small groups of 2-4. The instructor will distribute the “In-Class Activity: Exercise #3”
handout to each student. Put the names of your group at the top of the handout and circle
your name. Each student will complete the activity handout.

2. As a group, select ONE of the document collections below (circle your choice; everyone in the
group should select the same document collection) from Shakespeare’s famous quotes to
create an inverted index. Each line is equivalent to a document. (Quotations are from famous William
Shakespeare Quotes,

3. Working together as a group, each member will complete “In-Class Activity Exercise #3” and
write an inverted index (on the handout) that would be built for the selected document
collection. Refer to the “If you prick us” inverted index example for guidance in creating your
group’s inverted index. Note: The terms are to be listed in alphabetical order.

4. As a group, discuss and write your response to the following question (in your handout): what did
you learn about indexing as it pertains to information retrieval from this exercise?

5. Submit your group’s inverted index (and the selected Shakespeare Document Collection) to the
instructor at the end of class. This exercise will be graded based on the “Information Retrieval
Exercises Rubric.”

Shakespeare Document Collection #1

1. doubt thou the stars are fire
2. doubt that the sun doth move
3. doubt truth to be a liar, but
4. never doubt I love

Shakespeare Document Collection #2

1. a fool thinks himself to be wise
2. but a wise man knows himself to be a fool

Shakespeare Document Collection #3

1. we know what we are, but
2. know not what we may be

Shakespeare Document Collection #4

1. when a father gives to his son, both laugh
2. when a son gives to his father, both cry

Shakespeare Document Collection #5

1. we know what we are
2. but know not what we may be

Shakespeare Document Collection #6

1. some are born great
2. some achieve greatness
3. and some have greatness thrust upon them
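
A group that wants to check its hand-built index could use a short script like the sketch below, shown here with Document Collection #6. The tokenizer and the output layout (term, then df, then the (docno, tf) postings, with terms in alphabetical order) are assumptions modeled on the description of the example index, not a format required by the handout.

```python
import re
from collections import Counter, defaultdict

def build_index(docs):
    """Map each term to a list of (docno, tf) postings, with docnos starting at 1."""
    index = defaultdict(list)
    for docno, text in enumerate(docs, start=1):
        # Assumed tokenizer: lowercase, keep runs of letters/apostrophes.
        counts = Counter(re.findall(r"[a-z']+", text.lower()))
        for term, tf in counts.items():
            index[term].append((docno, tf))
    return dict(index)

def format_index(index):
    """Return one line per term, in alphabetical order: 'term df: (docno, tf), ...'."""
    lines = []
    for term in sorted(index):
        postings = index[term]
        body = ", ".join(f"({docno}, {tf})" for docno, tf in postings)
        lines.append(f"{term} {len(postings)}: {body}")
    return lines

# Shakespeare Document Collection #6; each line is treated as one document.
collection_6 = [
    "some are born great",
    "some achieve greatness",
    "and some have greatness thrust upon them",
]

index_6 = build_index(collection_6)
lines_6 = format_index(index_6)
for line in lines_6:
    print(line)
```

Note that this sketch treats "great" and "greatness" as different terms because it does no stemming; whether to merge such variants is an indexing decision worth raising in the group discussion.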

Group’s Response to the Following Question: What did you learn about indexing as it pertains to
information retrieval from this exercise?
