Nandini Bansal's activity
From 10/09/2021 to 11/07/2021
11/03/2021
- 12:54 PM RK-A Task #1858 (Resolved): Extending the variation_slash function in BR3_IR3_tagger.py
- In the C-API book, we saw KPs like "python/c" even when "python/c api" was present in its entirety. The reason is tha...
- 11:17 AM RK-A Task #1857 (Resolved): Extending the remove_header_by_adjective method to improve the quality of KPs
- In both C-API & Lib Ref book, there are headers like "other methods", "other functions", "other objects" present whic...
- 08:39 AM RK-A Feature #1856 (Rejected): Penalise the multi-word KP based on "Past" tense & vector similarity score
- In filter_by_past_tense, there is a condition which checks whether the KP is multi-word. If it is, we don't check the...
11/02/2021
- 01:19 PM RK-A Bug #1848 (Resolved): Remove KPs with unwanted word in "past" tense
- In the Library Reference book, there are some cases of KPs which included unwanted words in the past tense. In the ge...
11/01/2021
- 11:14 AM RK-A Bug #1834 (Resolved): Repeating words in the KP
- One case has been identified in the Library Reference book where the KP is "attribute attribute". It has be...
- 05:25 AM RK-A Feature #1593 (Rejected): Eliminate certain header variants generated from variations_in_common_section_words
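For Bug #1834 above (the "attribute attribute" KP), a check for immediately repeated words could look roughly like the sketch below. The function name has_repeated_words is hypothetical and is not taken from BR3_IR3_tagger.py; the sketch assumes KPs are whitespace-separated strings and that the comparison should ignore case.

```python
def has_repeated_words(kp: str) -> bool:
    """Return True if the same word appears twice in a row in the key phrase.

    Hypothetical helper: assumes whitespace tokenisation and a
    case-insensitive comparison, neither of which is confirmed by the tracker.
    """
    words = kp.lower().split()
    # Compare each word with its immediate successor.
    return any(first == second for first, second in zip(words, words[1:]))


if __name__ == "__main__":
    for phrase in ["attribute attribute", "python c api"]:
        print(phrase, "->", has_repeated_words(phrase))
```

A KP flagged by such a check could then be discarded or collapsed to a single word; which of the two the actual fix does is not stated in the log.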
10/28/2021
- 11:01 AM RK-A Feature #1817 (In Progress): Update the similar document score string for KPs matching with different header variants
- For KPs, more than one subsection header may be similar to a KP; we should make such changes ...
- 06:32 AM RK-A Feature #1766 (Resolved): Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
- 06:32 AM RK-A Bug #1765 (Resolved): Updating the P of tagged KPs where subKP and mainKP are separated by symbols
- 05:50 AM RK-A Feature #1815 (In Progress): Implement a new context algorithm for the KPs matching with <word1.word2> subsection headers
- In continuation of #1810
We need to implement a new context matching algorithm for the KPs matching with <word1.wo...
10/27/2021
- 10:47 AM RK-A Feature #1813 (In Progress): Analysis for improving the algorithms
- 10:44 AM RK-A Feature #1813 (In Progress): Analysis for improving the algorithms
- All the cases where the KP and header variant "word1.word2" were matching have been logged and saved in a text file. ...
- 10:46 AM RK-A Feature #1812 (In Progress): Analysis for improving the algorithms
- 10:44 AM RK-A Feature #1812 (Rejected): Analysis for improving the algorithms
- All the cases where the KP and header variant "word1.word2" were matching have been logged and saved in a text file. ...
- 10:46 AM RK-A Feature #1811 (In Progress): For matching "word1" with the KP, make use of token_processing function
- 10:42 AM RK-A Feature #1811 (Resolved): For matching "word1" with the KP, make use of token_processing function
- To check whether "word1" of header variant "word2.word1" matches the KP being tagged, we sho...
- 10:46 AM RK-A Feature #1810 (In Progress): Implement the penalty algorithm for KPs matching with "word1.word2" header variants in update_similarity_with_context function
- 10:36 AM RK-A Feature #1810 (Resolved): Implement the penalty algorithm for KPs matching with "word1.word2" header variants in update_similarity_with_context function
- After all the difficulties faced during the implementation of the above penalty in the BR3_IR3_tagger.py in the *gene...
- 10:46 AM RK-A Feature #1809 (In Progress): Checking the context of the KP matched with "word1.word2" header variant
- 10:30 AM RK-A Feature #1809 (Resolved): Checking the context of the KP matched with "word1.word2" header variant
- We have generally observed that if the KP matches the "word1.word2" header variant with KP being equivalent to "w...
10/26/2021
- 12:03 PM RK-A Feature #1807 (Resolved): Add a functionality in the ADR to log the KPs where sim_scores are changing
- As of now, if the KPs only have sim_score changes, the ADR does not log them but that shouldn't be the case. We need ...
- 06:07 AM RK-A Feature #1805 (Resolved): Remove KPs starting with numbers in words
- In the C-API book, I have seen some cases where the KPs are wrongly starting with numbers in words like "one position...
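For Feature #1805 above, a filter for KPs that begin with a number written as a word might be sketched as follows. The names NUMBER_WORDS and starts_with_number_word are hypothetical, and the word list is an illustrative assumption (zero through ten only); the log does not say which number words the real filter covers.

```python
# Illustrative assumption: only the number words zero..ten are covered.
NUMBER_WORDS = frozenset(
    {"zero", "one", "two", "three", "four", "five",
     "six", "seven", "eight", "nine", "ten"}
)


def starts_with_number_word(kp: str) -> bool:
    """Return True if the first word of the key phrase is a number word.

    Hypothetical helper: assumes whitespace tokenisation and a
    case-insensitive match against NUMBER_WORDS.
    """
    words = kp.lower().split()
    return bool(words) and words[0] in NUMBER_WORDS


if __name__ == "__main__":
    for phrase in ["one position", "position one"]:
        print(phrase, "->", starts_with_number_word(phrase))
```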
10/21/2021
- 10:18 AM RK-A Bug #1777 (Resolved): Correct processing of some headers to generate appropriate header variants
- It was observed in the C-API book that header variants are being generated inappropriately for some headers like "int...
10/20/2021
- 08:54 AM RK-A Bug #1751 (Closed): ADR includes rows with "New Added Docs" but the additions look wrong
- 08:54 AM RK-A Bug #1745 (Closed): Skip all the KPs starting with words within CW 300
- 08:53 AM RK-A Bug #1742 (Resolved): Preventing removal of KPs that are getting removed due to new/latest changes
- 08:53 AM RK-A Bug #1742 (In Progress): Preventing removal of KPs that are getting removed due to new/latest changes
- 08:52 AM RK-A Bug #1712 (Closed): Discarding bad KPs due to uncommon word at the beginning or end
- 08:52 AM RK-A Bug #1710 (Closed): Skip tokens that are URLs and file paths from tagging
- 08:50 AM RK-A Feature #1708 (Closed): For KPs located very closely, pick the one which is most similar & add a wrapper for lemmatisation to handle some exception cases
- 08:50 AM RK-A Bug #1670 (Closed): Fix the bug in saving has_noun dictionary values
- 08:50 AM RK-A Task #1653 (Closed): Adding all the header variants generated by variation_middle_parenthesis to processed_full_header with fullness_ratio 1.0
- 08:50 AM RK-A Bug #1644 (Closed): Testing Change: Modify the method of header variants generation using variations_in_common_section_words
- 08:41 AM RK-A Task #1643 (Closed): Modification in update_doc_id_score_list from tagging_utils.py such that for doc_ids with scores same as self links are reduced by 0.05
- 08:41 AM RK-A Bug #1616 (Closed): Some unwanted removals of KPs starting with VBG/IN
- 08:41 AM RK-A Feature #1615 (Closed): Add the str_between of variation_middle_parenthesis to processed_full_header list
- 08:40 AM RK-A Bug #1613 (Closed): Removing bad KPs which have comma and header variants also have comma
- 08:40 AM RK-A Bug #1599 (Closed): Modification in variation_variable_declarations to change the return values of the function for some cases of headers
- 08:40 AM RK-A Feature #1598 (Closed): Remove return datatype from headers with empty parenthesis
- 08:38 AM RK-A Feature #1587 (Closed): Discard and reduce the scores of KPs with apostrophe when the header variant does not contain it
- 08:38 AM RK-A Support #1570 (Closed): Reduce the time taken by get_candidates_for_variant function after modifications for matching a hyphenated word with a header without hyphen
- 08:37 AM RK-A Task #1561 (Closed): Modification in tagging_utils.py such that doc_ids with sim_score equal to the kp_doc_id are not removed
- 08:37 AM RK-A Feature #1544 (Closed): Changes to match a token/word with hyphen with a header variant which does not have a hyphen but is exactly same
- 08:37 AM RK-A Task #1543 (Closed): Refactor the save_candidates function from BR3_IR3_tagger.py
- 08:36 AM RK-A Task #1522 (Closed): Changes to ensure that singular and plural forms of key phrases are also checked in 20K CW in extract_single_uncommon_words method
- 08:36 AM RK-A Bug #1520 (Closed): Instead of removing entire key phrase starting with IN pos tag for all cases, we can process the keyphrase and discard just the word
- 08:36 AM RK-A Bug #1515 (Closed): Bug in addition of P's in KP tagging
- 08:35 AM RK-A Bug #1507 (Closed): Bad looking key phrases following the pattern: "word1, word2" while the header variant is "word1 word2"
- 08:34 AM RK-A Bug #1506 (Closed): Analyse & Discard single-word KPs with "ADV" POS tag
- 05:03 AM RK-A Bug #1767 (Resolved): Add processed subsection headers in the final processed_subsections dictionary irrespective of skip_header value
- As investigated by Rohit, the subsection headers with skip_header = True are not being added to the processed_subsecti...
10/19/2021
- 12:48 PM RK-A Feature #1766 (In Progress): Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
- 12:48 PM RK-A Feature #1766 (Resolved): Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
- We will have to update the update_tags function of the tagging_utils.py to make sure we can re-use the function for t...
- 12:42 PM RK-A Bug #1765 (In Progress): Updating the P of tagged KPs where subKP and mainKP are separated by symbols
- 12:42 PM RK-A Bug #1765 (Resolved): Updating the P of tagged KPs where subKP and mainKP are separated by symbols
- It was noticed that for single word subKPs and mainKPs separated by symbols, there were some discrepancies in the way...
10/14/2021
- 01:41 PM RK-A Bug #1755 (Resolved): In partial_header_match, increase penalty for some KPs where start and end words are same as header variant
- There are cases where word count of KP > word count of header variant and the uncommon word in the KP is VERB. The PO...
- 01:37 PM RK-A Bug #1754 (Closed): In partial_header_match, increase penalty for some KPs where start and end words are same as header variant
- For some cases, the word count of KPs and header variant is the same but they have uncommon middle words which make t...
- 01:27 PM RK-A Bug #1753 (Closed): In partial_header_match, skip penalty for some KPs where start and end words are same as header variant
- Some cases were observed where the KP is a proper subset of the header variant but it was penalized because the...
- 01:20 PM RK-A Bug #1752 (Resolved): In partial_header_match, reduce penalty for some KPs where start and end words are same as header variant
- In partial_header_match, we have a filter after the generation of candidates where we penalize the KPs because they s...
- 08:17 AM RK-A Bug #1751 (Resolved): ADR includes rows with "New Added Docs" but the additions look wrong
- 06:56 AM RK-A Bug #1751 (In Progress): ADR includes rows with "New Added Docs" but the additions look wrong
- 06:56 AM RK-A Bug #1751 (Closed): ADR includes rows with "New Added Docs" but the additions look wrong
- Need to verify the code for generation of ADR to understand why there are rows showing "New Docs Added" when the anno...
10/13/2021
- 10:26 AM RK-A Bug #1747 (In Progress): Add condition for "callbacks" in the lemmatization wrapper
- "callback" & "callbacks" are not being lemmatised to the same root word allowing close by tagging of the KPs in the "...
- 07:17 AM RK-A Bug #1746 (Resolved): Checking the vector similarity and wordnet similarity of "options" and "optional" to unlink them
- After the addition of the "CC" POS tag in the list of POS tags in the generate_candidates function, we saw that "opti...
- 07:10 AM RK-A Bug #1745 (Closed): Skip all the KPs starting with words within CW 300
- Upon making changes in the generate_candidates function, we saw that a lot of KPs were getting tagged with the starti...
- 07:00 AM RK-A Bug #1744 (Resolved): Calculating the fullness_ratio of the header variants to decide a threshold for removal of header variants
- Using the original headers, calculate the fullness_ratio of the header variants wrt the original headers to see if we...
- 06:53 AM RK-A Bug #1743 (Resolved): Checking singular and plural forms of the tmp_var from variations_in_common_section_words in common words list
- An experimental approach to further extend this task. Earlier we were only checking the tmp_var in the 4K CW list....
- 06:41 AM RK-A Bug #1670 (Resolved): Fix the bug in saving has_noun dictionary values
- 06:41 AM RK-A Feature #1708 (Resolved): For KPs located very closely, pick the one which is most similar & add a wrapper for lemmatisation to handle some exception cases
- 06:41 AM RK-A Bug #1515 (Resolved): Bug in addition of P's in KP tagging
- 06:40 AM RK-A Bug #1669 (In Progress): Getting rid of all dependency on Colab for the current code path
- 06:12 AM RK-A Task #1726: Handling cases of bad header variants like "representation"
- Estimated time increased as we are stuck with some cases that are difficult to manage
- 06:02 AM RK-A Bug #1742 (Resolved): Preventing removal of KPs that are getting removed due to new/latest changes
- There are cases of KPs which are getting removed (most prominently in Library Reference) due to recent changes. It in...
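For Bug #1747 above ("callback" and "callbacks" not lemmatising to the same root), a lemmatisation wrapper with an exception table could be sketched as below. The names LEMMA_EXCEPTIONS and lemmatize_with_exceptions are hypothetical, and the underlying lemmatiser is passed in as a plain callable because the log does not say which library the project uses.

```python
from typing import Callable

# Hypothetical exception table: surface forms the base lemmatiser is
# assumed to handle incorrectly, mapped to the desired root word.
LEMMA_EXCEPTIONS = {"callbacks": "callback"}


def lemmatize_with_exceptions(
    word: str, base_lemmatizer: Callable[[str], str]
) -> str:
    """Return the lemma of `word`, consulting the exception table first."""
    key = word.lower()
    if key in LEMMA_EXCEPTIONS:
        return LEMMA_EXCEPTIONS[key]
    # Fall back to whatever lemmatiser the caller supplies.
    return base_lemmatizer(word)


if __name__ == "__main__":
    # Stand-in lemmatiser that leaves words unchanged, for demonstration only.
    identity = lambda w: w
    print(lemmatize_with_exceptions("callbacks", identity))
    print(lemmatize_with_exceptions("callback", identity))
```

Keeping the exceptions in a table means new problem words can be added without touching the wrapper logic.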
10/11/2021
- 05:42 AM RK-A Task #1726 (Resolved): Handling cases of bad header variants like "representation"
- In BR3_IR3_tagger.py, we have a function called *variations_in_common_section_words* that strips all the common words...