Nandini Bansal's activity

From 10/09/2021 to 11/07/2021

11/03/2021

12:54 PM RK-A Task #1858 (Resolved): Extending the variation_slash function in BR3_IR3_tagger.py
In the C-API book, we saw KPs like "python/c" even when "python/c api" was present in the entirety. The reason is tha... Nandini Bansal
11:17 AM RK-A Task #1857 (Resolved): Extending the remove_header_by_adjective method to improve the quality of KPs
In both C-API & Lib Ref book, there are headers like "other methods", "other functions", "other objects" present whic... Nandini Bansal
08:39 AM RK-A Feature #1856 (Rejected): Penalise the multi-word KP based on "Past" tense & vector similarity score
In filter_by_past_tense, there is a condition which checks whether the KP is multi-word. If it is, we don't check the... Nandini Bansal

11/02/2021

01:19 PM RK-A Bug #1848 (Resolved): Remove KPs with unwanted word in "past" tense
In the Library Reference book, there are some cases of KPs which included unwanted words in the past tense. In the ge... Nandini Bansal

11/01/2021

11:14 AM RK-A Bug #1834 (Resolved): Repeating words in the KP
One case has been identified of the KP in the Library Reference book where the KP is "attribute attribute". It has be... Nandini Bansal
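A minimal sketch of the kind of check #1834 describes, not the project's actual fix. The ticket text is truncated, so whether such KPs were dropped or collapsed is unknown; both function names below are hypothetical and do not come from BR3_IR3_tagger.py.

```python
def has_repeated_adjacent_word(kp: str) -> bool:
    """Return True if any word is immediately followed by itself (case-insensitive)."""
    words = kp.lower().split()
    return any(a == b for a, b in zip(words, words[1:]))


def collapse_repeated_words(kp: str) -> str:
    """Drop immediate repeats, keeping the first word of each run."""
    kept = []
    for word in kp.split():
        # Compare case-insensitively against the last word we kept.
        if not kept or kept[-1].lower() != word.lower():
            kept.append(word)
    return " ".join(kept)
```

Under these assumptions, "attribute attribute" is flagged by the first helper and reduced to "attribute" by the second; a KP without adjacent repeats passes through unchanged.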
05:25 AM RK-A Feature #1593 (Rejected): Eliminate certain header variants generated from variations_in_common_section_words
Nandini Bansal

10/28/2021

11:01 AM RK-A Feature #1817 (In Progress): Update the similar document score string for KPs matching with different header variants
Nandini Bansal
11:01 AM RK-A Feature #1817 (In Progress): Update the similar document score string for KPs matching with different header variants
For a KP, more than one subsection header may be similar to it; we should make such changes ... Nandini Bansal
06:32 AM RK-A Feature #1766 (Resolved): Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
Nandini Bansal
06:32 AM RK-A Bug #1765 (Resolved): Updating the P of tagged KPs where subKP and mainKP are separated by symbols
Nandini Bansal
05:50 AM RK-A Feature #1815 (In Progress): Implement a new context algorithm for the KPs matching with <word1.word2> subsection headers
In continuation of #1810
We need to implement a new context matching algorithm for the KPs matching with <word1.wo...
Nandini Bansal

10/27/2021

10:47 AM RK-A Feature #1813 (In Progress): Analysis for improving the algorithms
Nandini Bansal
10:44 AM RK-A Feature #1813 (In Progress): Analysis for improving the algorithms
All the cases where the KP and header variant "word1.word2" were matching have been logged and saved in a text file. ... Nandini Bansal
10:46 AM RK-A Feature #1812 (In Progress): Analysis for improving the algorithms
Nandini Bansal
10:44 AM RK-A Feature #1812 (Rejected): Analysis for improving the algorithms
All the cases where the KP and header variant "word1.word2" were matching have been logged and saved in a text file. ... Nandini Bansal
10:46 AM RK-A Feature #1811 (In Progress): For matching "word1" with the KP, make use of the token_processing function
Nandini Bansal
10:42 AM RK-A Feature #1811 (Resolved): For matching "word1" with the KP, make use of the token_processing function
To check whether "word1" of header variant "word2.word1" matches the KP being tagged, we sho... Nandini Bansal
10:46 AM RK-A Feature #1810 (In Progress): Implement the penalty algorithm for KPs matching with "word1.word2" header variants in update_similarity_with_context function
Nandini Bansal
10:36 AM RK-A Feature #1810 (Resolved): Implement the penalty algorithm for KPs matching with "word1.word2" header variants in update_similarity_with_context function
After all the difficulties faced during the implementation of the above penalty in the BR3_IR3_tagger.py in the *gene... Nandini Bansal
10:46 AM RK-A Feature #1809 (In Progress): Checking the context of the KP matched with "word1.word2" header variant
Nandini Bansal
10:30 AM RK-A Feature #1809 (Resolved): Checking the context of the KP matched with "word1.word2" header variant
We have generally observed that if the KP matches the "word1.word2" header variant, with the KP being equivalent to "w... Nandini Bansal

10/26/2021

12:03 PM RK-A Feature #1807 (Resolved): Add a functionality in the ADR to log the KPs where sim_scores are changing
As of now, if the KPs only have sim_score changes, the ADR does not log them but that shouldn't be the case. We need ... Nandini Bansal
06:07 AM RK-A Feature #1805 (Resolved): Remove KPs starting with numbers in words
In the C-API book, I have seen some cases where the KPs are wrongly starting with numbers in words like "one position... Nandini Bansal
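A rough sketch of the filter #1805 describes, assuming a small hand-written list of number words. The list, the function names, and the choice to drop the whole KP are all illustrative assumptions rather than the tagger's actual code.

```python
# Hypothetical number-word list; the real filter may use a different source.
NUMBER_WORDS = frozenset({
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "twenty", "hundred",
})


def starts_with_number_word(kp: str) -> bool:
    """Return True if the first token of the KP is a spelled-out number."""
    tokens = kp.lower().split()
    return bool(tokens) and tokens[0] in NUMBER_WORDS


def drop_number_word_kps(kps):
    """Keep only KPs that do not start with a spelled-out number."""
    return [kp for kp in kps if not starts_with_number_word(kp)]
```

Only the leading token is checked, so a KP such as "argument two" would survive this filter.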

10/21/2021

10:18 AM RK-A Bug #1777 (Resolved): Correct processing of some headers to generate appropriate header variants
It was observed in the C-API book that header variants are being generated inappropriately for some headers like "int... Nandini Bansal

10/20/2021

08:54 AM RK-A Bug #1751 (Closed): ADR includes rows with "New Added Docs" but the additions look wrong
Nandini Bansal
08:54 AM RK-A Bug #1745 (Closed): Skip all the KPs starting with words within CW 300
Nandini Bansal
08:53 AM RK-A Bug #1742 (Resolved): Preventing removal of KPs that are getting removed due to new/latest changes
Nandini Bansal
08:53 AM RK-A Bug #1742 (In Progress): Preventing removal of KPs that are getting removed due to new/latest changes
Nandini Bansal
08:52 AM RK-A Bug #1712 (Closed): Discarding bad KPs due to uncommon word at the beginning or end
Nandini Bansal
08:52 AM RK-A Bug #1710 (Closed): Skip tokens that are URLs and file paths from tagging
Nandini Bansal
08:50 AM RK-A Feature #1708 (Closed): For KPs located very closely, pick the one which is most similar & add a wrapper for lemmatisation to handle some exception cases
Nandini Bansal
08:50 AM RK-A Bug #1670 (Closed): Fix the bug in saving has_noun dictionary values
Nandini Bansal
08:50 AM RK-A Task #1653 (Closed): Adding all the header variants generated by variation_middle_parenthesis to processed_full_header with fullness_ratio 1.0
Nandini Bansal
08:50 AM RK-A Bug #1644 (Closed): Testing Change: Modify the method of header variants generation using variations_in_common_section_words
Nandini Bansal
08:41 AM RK-A Task #1643 (Closed): Modification in update_doc_id_score_list from tagging_utils.py such that for doc_ids with scores same as self links are reduced by 0.05
Nandini Bansal
08:41 AM RK-A Bug #1616 (Closed): Some unwanted removals of KPs starting with VBG/IN
Nandini Bansal
08:41 AM RK-A Feature #1615 (Closed): Add the str_between of variation_middle_parenthesis to processed_full_header list
Nandini Bansal
08:40 AM RK-A Bug #1613 (Closed): Removing bad KPs which have comma and header variants also have comma
Nandini Bansal
08:40 AM RK-A Bug #1599 (Closed): Modification in variation_variable_declarations to change the return values of the function for some cases of headers
Nandini Bansal
08:40 AM RK-A Feature #1598 (Closed): Remove return datatype from headers with empty parenthesis
Nandini Bansal
08:38 AM RK-A Feature #1587 (Closed): Discard and reduce the scores of KPs with an apostrophe when the header variant does not contain it
Nandini Bansal
08:38 AM RK-A Support #1570 (Closed): Reduce the time taken by get_candidates_for_variant function after modifications for matching a hyphenated word with a header without hyphen
Nandini Bansal
08:37 AM RK-A Task #1561 (Closed): Modification in tagging_utils.py such that doc_ids with sim_score equal to the kp_doc_id are not removed
Nandini Bansal
08:37 AM RK-A Feature #1544 (Closed): Changes to match a token/word with hyphen with a header variant which does not have a hyphen but is exactly same
Nandini Bansal
08:37 AM RK-A Task #1543 (Closed): Refactor the save_candidates function from BR3_IR3_tagger.py
Nandini Bansal
08:36 AM RK-A Task #1522 (Closed): Changes to ensure that singular and plural forms of key phrases are also checked in 20K CW in extract_single_uncommon_words method
Nandini Bansal
08:36 AM RK-A Bug #1520 (Closed): Instead of removing entire key phrase starting with IN pos tag for all cases, we can process the keyphrase and discard just the word
Nandini Bansal
08:36 AM RK-A Bug #1515 (Closed): Bug in addition of P's in KP tagging
Nandini Bansal
08:35 AM RK-A Bug #1507 (Closed): Bad looking key phrases following the pattern: "word1, word2" while the header variant is "word1 word2"
Nandini Bansal
08:34 AM RK-A Bug #1506 (Closed): Analyse & Discard single-word KPs with "ADV" POS tag
Nandini Bansal
05:03 AM RK-A Bug #1767 (Resolved): Add processed subsection headers in the final processed_subsections dictionary irrespective of skip_header value
As investigated by Rohit, the subsection headers with skip_header = True are not being added to the processed_subsecti... Nandini Bansal

10/19/2021

12:48 PM RK-A Feature #1766 (In Progress): Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
Nandini Bansal
12:48 PM RK-A Feature #1766 (Resolved): Refactoring and updating the logic of update_tags function to make universal use in mainKP and subKP tagging
We will have to update the update_tags function of the tagging_utils.py to make sure we can re-use the function for t... Nandini Bansal
12:42 PM RK-A Bug #1765 (In Progress): Updating the P of tagged KPs where subKP and mainKP are separated by symbols
Nandini Bansal
12:42 PM RK-A Bug #1765 (Resolved): Updating the P of tagged KPs where subKP and mainKP are separated by symbols
It was noticed that for single word subKPs and mainKPs separated by symbols, there were some discrepancies in the way... Nandini Bansal

10/14/2021

01:41 PM RK-A Bug #1755 (Resolved): In partial_header_match, increase penalty for some KPs where start and end words are same as header variant
There are cases where word count of KP > word count of header variant and the uncommon word in the KP is VERB. The PO... Nandini Bansal
01:37 PM RK-A Bug #1754 (Closed): In partial_header_match, increase penalty for some KPs where start and end words are same as header variant
For some cases, the word count of KPs and header variant is the same but they have uncommon middle words which make t... Nandini Bansal
01:27 PM RK-A Bug #1753 (Closed): In partial_header_match, skip penalty for some KPs where start and end words are same as header variant
Some cases were observed where the KP is a proper subset of the header variant but was penalized because the... Nandini Bansal
01:20 PM RK-A Bug #1752 (Resolved): In partial_header_match, reduce penalty for some KPs where start and end words are same as header variant
In partial_header_match, we have a filter after the generation of candidates where we penalize the KPs because they s... Nandini Bansal
08:17 AM RK-A Bug #1751 (Resolved): ADR includes rows with "New Added Docs" but the additions look wrong
Nandini Bansal
06:56 AM RK-A Bug #1751 (In Progress): ADR includes rows with "New Added Docs" but the additions look wrong
Nandini Bansal
06:56 AM RK-A Bug #1751 (Closed): ADR includes rows with "New Added Docs" but the additions look wrong
Need to verify the code for generation of ADR to understand why there are rows showing "New Docs Added" when the anno... Nandini Bansal

10/13/2021

10:26 AM RK-A Bug #1747 (In Progress): Add condition for "callbacks" in the lemmatization wrapper
"callback" & "callbacks" are not being lemmatised to the same root word allowing close by tagging of the KPs in the "... Nandini Bansal
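One way such a wrapper condition could look, sketched under assumptions: the underlying lemmatizer is passed in as a plain callable, and the override table and function name are hypothetical. The project's real wrapper and lemmatizer are not shown in this log.

```python
from typing import Callable, Mapping

# Hypothetical exception table for words the base lemmatizer handles inconsistently.
LEMMA_OVERRIDES: Mapping[str, str] = {"callbacks": "callback"}


def lemmatize_with_overrides(
    word: str,
    base_lemmatizer: Callable[[str], str],
    overrides: Mapping[str, str] = LEMMA_OVERRIDES,
) -> str:
    """Return the override lemma if one exists, else defer to the base lemmatizer."""
    key = word.lower()
    if key in overrides:
        return overrides[key]
    return base_lemmatizer(word)
```

With this shape, "callback" and "callbacks" resolve to the same root whenever the base lemmatizer returns "callback" unchanged, which is the property the ticket asks for.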
07:17 AM RK-A Bug #1746 (Resolved): Checking the vector similarity and wordnet similarity of "options" and "optional" to unlink them
After the addition of the "CC" POS tag in the list of POS tags in the generate_candidates function, we saw that "opti... Nandini Bansal
07:10 AM RK-A Bug #1745 (Closed): Skip all the KPs starting with words within CW 300
Upon making changes in the generate_candidates function, we saw that a lot of KPs were getting tagged with the starti... Nandini Bansal
07:00 AM RK-A Bug #1744 (Resolved): Calculating the fullness_ratio of the header variants to decide a threshold for removal of header variants
Using the original headers, calculate the fullness_ratio of the header variants wrt the original headers to see if we... Nandini Bansal
06:53 AM RK-A Bug #1743 (Resolved): Checking singular and plural forms of the tmp_var from variations_in_common_section_words in common words list
An experimental approach to further extend this task. Earlier, we were only checking the tmp_var in the 4K CW list.... Nandini Bansal
06:41 AM RK-A Bug #1670 (Resolved): Fix the bug in saving has_noun dictionary values
Nandini Bansal
06:41 AM RK-A Feature #1708 (Resolved): For KPs located very closely, pick the one which is most similar & add a wrapper for lemmatisation to handle some exception cases
Nandini Bansal
06:41 AM RK-A Bug #1515 (Resolved): Bug in addition of P's in KP tagging
Nandini Bansal
06:40 AM RK-A Bug #1669 (In Progress): Getting rid of all dependency on Colab for the current code path
Nandini Bansal
06:12 AM RK-A Task #1726: Handling cases of bad header variants like "representation"
Estimated time increased because we are stuck on some cases that are difficult to handle. Nandini Bansal
06:02 AM RK-A Bug #1742 (Resolved): Preventing removal of KPs that are getting removed due to new/latest changes
There are cases of KPs which are getting removed (most prominently in Library Reference) due to recent changes. It in... Nandini Bansal

10/11/2021

05:42 AM RK-A Task #1726 (Resolved): Handling cases of bad header variants like "representation"
In BR3_IR3_tagger.py, we have a function called *variations_in_common_section_words* that strips all the common words... Nandini Bansal
 
