
Katja Filippova - Ph.D. candidate

Contact Information
EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg
Germany

email: Lastname at eml-research.de
Tel.: +49 (0)6221 - 533 - 238
Fax: +49 (0)6221 - 533 - 298

Research Interests
  • generation and summarization
  • discourse coherence and information structure
My PhD is about sentence fusion, that is, how to generate a single novel sentence from a set of related sentences. The approach I take relies on the dependency structures of sentences and, like existing approaches, proceeds in two steps: (1) a new tree is generated; (2) this tree is linearized. It is not abstractive summarization yet, but it can be considered a first step towards it.

The main goal of my PhD research is to find a way to generate grammatical and readable sentences, since this is what existing approaches have reportedly had trouble with. If you are interested in the details, please have a look at the papers. The recent EMNLP paper describes how a new dependency tree is generated by compressing a graph of aligned words. The linearization algorithm is described in the ACL 07 paper. The INLG paper explains how the method can be applied to sentence compression and presents results on English data. I will soon put the code and the data online so that the experiments can be replicated.

Other topics I am interested in are unsupervised and language-independent methods in NLP and, on the more linguistic side, cognitive and functional linguistics.

Publications

2009

  • Company Oriented Extractive Summarization of Financial News | Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 09). Athens, Greece, March 30 - April 3, 2009, to appear. (PDF)
    Katja Filippova, Mihai Surdeanu, Massimiliano Ciaramita and Hugo Zaragoza

2008

  • Sentence Fusion via Dependency Graph Compression | Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP 08). Honolulu, Hawaii, October 25-27, 2008, pp. 177-185. (PDF)
    Katja Filippova and Michael Strube

  • Dependency Tree Based Sentence Compression | International Natural Language Generation Conference (INLG 08). Salt Fork, Ohio, June 12-14, 2008, pp. 25-32. (PDF)
    Katja Filippova and Michael Strube

2007

  • German Vorfeld and Local Coherence | Special Issue on Coherence in Dialogue and Generation of Journal of Logic, Language, and Information (JoLLI). Volume 16(4), pp. 465-485. (Abstract)
    Katja Filippova and Michael Strube

  • Generating Constituent Order in German Clauses | Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 07). Prague, Czech Republic, June 23-25, 2007, pp. 320-327. (PDF)
    Katja Filippova and Michael Strube

  • Extending the Entity-grid Coherence Model to Semantically Related Entities | Proceedings of the 11th European Workshop on Natural Language Generation (ENLG 07). Schloss Dagstuhl, Germany, June 17-20, 2007, pp. 139-142. (PDF)
    Katja Filippova and Michael Strube

  • Cascaded Filtering for Topic-Driven Multi-Document Summarization | Proceedings of the 2007 Document Understanding Conference (DUC 07). Rochester, N.Y., April 22-27, 2007. (PDF)
    Katja Filippova, Margot Mieskes, Vivi Nastase, Simone Paolo Ponzetto and Michael Strube

2006

  • Improving Text Fluency by Reordering of Constituents | Proceedings of the ESSLLI Workshop on Modelling Coherence for Generation and Dialogue Systems. Malaga, Spain, July 31 - August 11, 2006, pp. 9-16. (PDF)
    Katja Filippova and Michael Strube

  • Using Linguistically Motivated Features for Paragraph Segmentation | Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 06). Sydney, Australia, July 22-23, 2006, pp. 267-274. (PDF)
    Katja Filippova and Michael Strube

2005

  • What Treebanks Can Do for You: Rule-based and Machine-learning Approaches to Anaphora Resolution in German | Proceedings of the 4th Workshop on Treebanks and Linguistic Theories (TLT 05). Barcelona, Spain, December 9-10, 2005, pp. 77-88. (PDF)
    Erhard Hinrichs, Katja Filippova and Holger Wunsch

  • A Data-driven Approach to Pronominal Anaphora Resolution in German | Proceedings of the 5th International Conference on Recent Advances in Natural Language Processing (RANLP 05). Borovets, Bulgaria, September 21-23, 2005, pp. 239-245. (PDF)
    Erhard Hinrichs, Katja Filippova and Holger Wunsch

Downloads

The annotated corpus of Wikipedia biographies (WikiBiography) that I am using is available here. Annotated articles look like this:



If you have any questions concerning the data, I would be glad to help.

About Me

Before coming to EML, I went to school and university in St. Petersburg (BA in Linguistics) and then studied for an MA in Computational Linguistics in Tübingen (under the supervision of Prof. Dr. Erhard Hinrichs and Dr. Sandra Kübler). Currently, I am a PhD student of Prof. Dr. Elke Teich at TU Darmstadt.

In case you wonder what the best time to go to Petersburg is, I suggest June, late August to early September, and January. If you have to choose between Heidelberg and Tübingen, I would recommend visiting Heidelberg. Even if this information is of no use to you, well, you have still learned something about me.

Personal Interests

I am interested in pretty much everything, in particular good books, good music, good (motion) pictures, and opera. It is tempting to write more, so I might add a list of the exact names and titles of what I consider "good".

Miscellaneous

A list of NLP terms related to summarization with Russian translations. Any comments are welcome.



© EML Research gGmbH