Geographic and Textual Data Fusion in Forostar.
CLEF LNCS, 2009
In this paper we provide some analysis of data fusion techniques employed at GeoCLEF 2008 to merge textual and geographic relevance. These methods are compared to our own experiments, where using our GIR system, Forostar, we show that an aggressive filter-based data fusion method can outperform a more sophisticated penalisation method.
@inproceedings{Overell:2008:GTD:1813809.1813936,
author = {Overell, Simon and Rae, Adam and R\"{u}ger, Stefan},
title = {Geographic and textual data fusion in Forostar},
booktitle = {Proceedings of the 9th Cross-language evaluation forum conference on
Evaluating systems for multilingual and multimodal information access},
series = {CLEF'08},
year = {2009},
isbn = {3-642-04446-8, 978-3-642-04446-5},
location = {Aarhus, Denmark},
pages = {838--842},
numpages = {5},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
}
Exploiting Term Co-occurrence for Enhancing Automated Image Annotation.
CLEF LNCS, 2009
This paper describes an application of statistical co-occurrence techniques that, built on top of a probabilistic image annotation framework, is able to increase the precision of an image annotation system. We observe that probabilistic image analysis by itself is not enough to describe the rich semantics of an image. Our hypothesis is that more accurate annotations can be produced by introducing additional knowledge in the form of statistical co-occurrence of terms. This knowledge is provided by the context of images, which independent keyword generation would otherwise miss. We applied our algorithm to the dataset provided by ImageCLEF 2008 for the Visual Concept Detection Task (VCDT). Our algorithm not only obtained better results but also appeared in the top quartile of all methods submitted to ImageCLEF 2008.
@inproceedings{Llorente:2008:ETC:1813809.1813902,
author = {Llorente, Ainhoa and Overell, Simon and Liu, Haiming and Hu, Rui and
Rae, Adam and Zhu, Jianhan and Song, Dawei and R\"{u}ger, Stefan},
title = {Exploiting term co-occurrence for enhancing automated image annotation},
booktitle = {Proceedings of the 9th Cross-language evaluation forum conference on
Evaluating systems for multilingual and multimodal information access},
series = {CLEF'08},
year = {2009},
isbn = {3-642-04446-8, 978-3-642-04446-5},
location = {Aarhus, Denmark},
pages = {632--639},
numpages = {8},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
}
MMIS at GeoCLEF 2008: Experiments in GIR.
CLEF Working Notes 2008, Aarhus, Denmark
In this paper we present our Geographic Information Retrieval System, Forostar, and the results of three experiments. We compare two data fusion methods, and show that a simple geographic filter outperforms a penalty based system. We compare context based disambiguation to a default gazetteer and show no significant difference. Finally we compare a unique geographic index to an ambiguous geographic index. The ambiguous index outperformed all other methods and was statistically significantly better than the baseline.
@inproceedings{overell08b,
title={MMIS at GeoCLEF 2008: Experiments in GIR},
author={Simon Overell and Adam Rae and Stefan R\"uger},
year={2008},
month={September},
booktitle={CLEF 2008 Workshop, Working notes},
editor={Alessandro Nardi and Carol Peters},
location={Aarhus, Denmark}
}
MMIS at ImageCLEF 2008: Experiments combining Different Evidence Sources.
CLEF Working Notes 2008, Aarhus, Denmark
This paper presents the work of the MMIS group at ImageCLEF 2008. The results for three tasks are presented: Visual Concept Detection Task (VCDT), ImageCLEFphoto and ImageCLEFwiki. We combine image annotations, CBIR, textual relevance and a geographic filter using our generic data fusion method. We also compare methods for BRF and clustering. Our top performing method in the VCDT enhances supervised learning by modifying probabilities based on a matrix that shows how terms appear together. Although it occurred in the top quartile of submitted runs, the enhancement did not provide a statistically significant improvement. In the ImageCLEFphoto task we demonstrate that evidence from image retrieval can provide a contribution to retrieval; however, we are yet to find a way of combining text and image evidence that provides an improvement over the baseline. Due to the relative performances of different evidence sources in ImageCLEFwiki and our failure to improve over a baseline, we conclude that text is the dominant feature in this collection.
@inproceedings{overell08c,
title={MMIS at ImageCLEF 2008: Experiments combining Different Evidence Sources},
author={Simon Overell and Ainhoa Llorente and Haiming Liu and Rui Hu and Adam Rae
and Jianhan Zhu and Dawei Song and Stefan R\"uger},
year={2008},
month={September},
booktitle={CLEF 2008 Workshop, Working notes},
editor={Alessandro Nardi and Carol Peters},
location={Aarhus, Denmark}
}
GIR Experiments with Forostar.
CLEF LNCS, 2008
In this paper we describe our Geographic Information Retrieval experiments with Forostar, our GIR application, on the GeoCLEF 2007 corpus and query set. We compare the results from orthogonal text with no geographic entities and only geographic entities with standard text retrieval and combined text and geographic relevance methods. The text and named entity analysis and retrieval methods of Forostar are described in detail. We also detail our placename disambiguation and geographic relevance ranking methods. The paper concludes with an analysis of our results, including significance testing, where we show our baseline method, in fact, to be best. Finally we identify weaknesses in our approach and ways in which the system could be optimised and improved.
@inproceedings{overell08d,
publisher={Springer Berlin / Heidelberg},
volume={5152},
booktitle={Advances in Multilingual and Multimodal Information Retrieval},
year={2008},
isbn={978-3-540-85759-4},
pages={856--863},
title={GIR Experiments with Forostar},
author={Simon Overell and Jo\~ao Magalh\~aes and Stefan R\"uger}
}
GIR experiments with Forostar at GeoCLEF 2007. (Poster)
CLEF Working Notes 2007, Budapest
In this paper we describe our Geographic Information Retrieval experiments with Forostar, our GIR application, on the GeoCLEF 2007 corpus and query set. We compare the results from orthogonal text with no geographic entities and only geographic entities with standard text retrieval and combined text and geographic relevance methods. The text and named entity analysis and retrieval methods of Forostar are described in detail. We also detail our placename disambiguation and geographic relevance ranking methods. The paper concludes with an analysis of our results, including significance testing, where we show our baseline method, in fact, to be best. Finally we identify weaknesses in our approach and ways in which the system could be optimised and improved.
@inproceedings{overell07e2,
title={GIR experiments with Forostar at GeoCLEF 2007},
author={Simon Overell and Jo\~ao Magalh\~aes and Stefan R\"uger},
year={2007},
month={September},
booktitle={CLEF 2007 Workshop, Abstracts},
editor={Alessandro Nardi and Carol Peters},
ISBN={2-912335-31-0},
ISSN={1818-8044},
pages={51},
location={Budapest, Hungary}
}
Exploring Image, Text and Geographic Evidences in ImageCLEF 2007. (Poster)
CLEF Working Notes 2007, Budapest
This year, the ImageCLEF 2007 data provided multiple evidences that can be explored in many different ways. In this paper we describe an information retrieval framework that combines image, text and geographic evidence. Geographic analysis implements a placename disambiguation method, and placenames are indexed by their Getty TGN Unique Id. Image analysis implements a query by semantic example model. The paper concludes with an analysis of our results. Finally we identify the weaknesses in our approach and ways in which the system could be optimised and improved.
@inproceedings{overell07d1,
title={Exploring Image, Text and Geographic Evidences in ImageCLEF 2007},
author={Jo\~ao Magalh\~aes and Simon Overell and Stefan R\"uger},
year={2007},
month={September},
booktitle={CLEF 2007 Workshop, Working notes},
editor={Alessandro Nardi and Carol Peters},
ISBN={2-912335-32-9},
ISSN={1818-8044},
location={Budapest, Hungary}
}
@inproceedings{overell07d2,
title={Exploring Image, Text and Geographic Evidences in ImageCLEF 2007},
author={Jo\~ao Magalh\~aes and Simon Overell and Stefan R\"uger},
year={2007},
month={September},
booktitle={CLEF 2007 Workshop, Abstracts},
editor={Alessandro Nardi and Carol Peters},
ISBN={2-912335-31-0},
ISSN={1818-8044},
pages={30},
location={Budapest, Hungary}
}
Forostar: A System for GIR.
CLEF LNCS, 2007
We detail our methods for generating and applying co-occurrence models for the purpose of placename disambiguation. We explain in detail our use of co-occurrence models for placename disambiguation using a model generated from Wikipedia. The presented system is split into two stages: a batch text & geographic indexer and a real time query engine. Four alternative query constructions and six methods of generating a geographic index are compared. The paper concludes with a full description of future work and ways in which the system could be optimised.
@incollection{overell2007c,
series={Lecture Notes in Computer Science},
publisher={Springer},
title={Forostar: A System for GIR},
author={Simon Overell and Jo\~ao Magalh\~aes and Stefan R\"uger},
year={2007},
month={September},
booktitle={Evaluation of Multilingual and Multi-modal Information Retrieval},
editor={Carol Peters and Paul Clough and Fredric C. Gey and Jussi Karlgren
and Bernardo Magnini and Douglas W. Oard and Maarten de Rijke and Maximilian Stempfhuber},
volume={4730},
doi={10.1007/978-3-540-74999-8_119},
ISBN={978-3-540-74998-1},
ISSN={0302-9743},
address={Berlin / Heidelberg},
pages={930--937}
}
Imperial College and Johns Hopkins University at TRECVID. (Poster)
TRECVid 2006, Gaithersburg
We describe our experiments for the high-level feature extraction and search tasks. For the search task, we tested the system we have used in previous years, which encapsulates content based image search, image browsing, automated image annotation and named entity extraction. For the feature task we apply the nonparametric density estimation model and the HMM-based concept specific image model.
@inproceedings{overell06c,
title={{I}mperial {C}ollege and {J}ohns {H}opkins {U}niversity at {TRECVid}},
author={Arnab Ghoshal and Sanjeev Khudanpur and Jo\~ao Magalh\~aes and Simon Overell
and Stefan R\"uger and Alexei Yavlinsky},
year={2006},
month={November},
booktitle = {TRECVid 2006 -- Text REtrieval Conference TRECVid Workshop},
location = {Gaithersburg, MD}
}
Place disambiguation with co-occurrence models.
CLEF Working Notes 2006, Alicante
In this paper we describe the geographic information retrieval system developed by the Multimedia & Information Systems team for GeoCLEF 2006 and the results achieved. We detail our methods for generating and applying co-occurrence models for the purpose of place name disambiguation, our use of named entity recognition tools and text indexing applications. The presented system is split into two stages: a batch text & geographic indexer and a real time query engine. The query engine takes manually crafted queries where the text component is separated from the geographic component. Two monolingual runs were submitted for the GeoCLEF evaluation, the first constructed from the title and description, the second included the narrative also. We explain in detail our use of co-occurrence models for place name disambiguation using a model generated from Wikipedia. The paper concludes with a full description of future work and ways in which the system could be optimised.
@inproceedings{overell06b1,
title={Place disambiguation with co-occurrence models},
author={Simon Overell and Jo\~ao Magalh\~aes and Stefan R\"uger},
year={2006},
month={September},
booktitle={CLEF 2006 Workshop, Working notes},
editor={Alessandro Nardi and Carol Peters and Jose Luis Vicedo},
ISBN={2-912335-23-X},
ISSN={1818-8044},
location={Alicante, Spain}
}
@inproceedings{overell06b2,
title={Place disambiguation with co-occurrence models},
author={Simon Overell and Jo\~ao Magalh\~aes and Stefan R\"uger},
year={2006},
month={September},
booktitle={CLEF 2006 Workshop, Abstracts},
editor={Alessandro Nardi and Carol Peters and Jose Luis Vicedo},
ISBN={2-912335-23-3},
ISSN={1818-8044},
pages={59},
location={Alicante, Spain}
}