Behind the Scenes: Nicoletta Calzolari 

Submitted by Karina Berger on

 

In our 'Behind the Scenes' series, we introduce the people who work for and use our infrastructure. In the series, we feature pioneers, researchers, ambassadors, committee chairs, PhD students, and more. This month, one of CLARIN's pioneers, Nicoletta Calzolari, remembers the infrastructure's beginnings.   

 

Please introduce yourself. What is your background?

After graduating in Philosophy, I started my career at the University of Pisa with a national ministerial grant in computational linguistics, a completely new field to me. I liked it a lot.

At the beginning of the 1980s I was the first to start a quite new area of research – the acquisition of lexical information from machine-readable dictionaries – which soon became a trendy subject of research (both in academia and industry) on all continents. This awarded Pisa the coordination of the first European project, the ‘ESPRIT Basic Research Action, ACQUILEX’ (Acquisition of Lexical Knowledge for Natural Language Processing Systems) (1989-95), after which so many EC projects followed! I introduced the notion of ‘reusability’ in the area of language resources (LR), which finally led to a series of important initiatives in Europe and world-wide, aimed at standardisation in the field, mostly coordinated by us at the ILC (Institute of Computational Linguistics).

I was thus one of the pioneers in the revolution of the 1980s, when LRs (i.e. linguistic data) began to be understood as critical to make steps forward in computational linguistics, prompting research that was at the time quite new: acquiring information from machine-readable dictionaries (MRDs), instead of relying on linguists’ intuition, and so starting what became the ‘data-driven’ approach.

Later, I received an Honorary Doctorate in Philosophy from the University of Copenhagen. I was also awarded the title ACL Fellow (in the Founding Group) for ‘significant contributions to computational lexicography, and for the creation and dissemination of language resources’.

 

You are one of CLARIN’s pioneers. How did you first get involved?

With Antonio Zampolli, we initiated, in the 1980s, the quite new sector of ‘language resources’, which soon became an established and very influential sector in our field. This has led to an impressive number of European and international projects, many of them coordinated in Pisa. We also introduced the notion of the ‘infrastructural’ role of language resources, which soon became widely recognised and was at the basis of many strategic activities from the 1990s, both in Europe and worldwide. An important initiative within this trend was in fact CLARIN. This led also to the establishment, in 1998, of the biennial international LREC Conference, of which I have been the General Chair since the edition in 2004.

 

 

'I was so much involved in LR research that I remember a meeting at the European Commission, where the EC organiser introduced me as ‘Madame LR’, because of my recognised leading position in the field!'

Nicoletta Calzolari

 

 

Before the beginning of CLARIN, I had many meetings with Peter Wittenburg (working until then in a different area), where we discussed many issues related to language resources, mostly of infrastructural nature. I still remember when Peter phoned me (I was travelling in a bus), very excited, saying that there was an opportunity for the idea of a new project and explained to me what it was … It would become, years later, CLARIN!

So I became one of the designers, with Peter and a few other colleagues/friends, of what then became CLARIN. 

  

What is one of your favourite memories of CLARIN’s conception/early days?

Informal brainstorming in Berlin in 2007.

Already in a note of 2005 we said: Our domain has a long tradition, and many recent initiatives mean that our domain is ready to create an integrated and interoperable domain for LRs based on persistent services. Such a domain will remove barriers for LR-based research in Europe and therefore improve its competitiveness.

This motivated us to write the EARL (European Archive for Language resources) proposal (precursor of CLARIN) and put it on the roadmap for European research infrastructures. And this small group of persons took the initiative to establish a network to take care of all relevant aspects of forming and establishing EARL. By the way, I found in old notes that we had other alternative names: DELAN (Distributed European Language resource Archive Network), ELAN (European Language resource Archive Network), LARA (LAnguage Resource Association) and ENAL (European Network of Archives for Language resources). As you see, CLARIN was not yet a name. The EARL was planned to start in 2007.

The menu with the signatures of all the 'wedding witnesses' where the 'marriage' between Steven Krauwer (representing CLARIN) and Nicoletta Calzolari (chair of the FLARENET Network) was celebrated. 

The preparation meetings were very interesting, with a small group of colleagues/friends aiming at shaping the best future for LRs in Europe! We had a long-term vision for the field.

Another nice feature of CLARIN has always been, as written in the 2008 grant agreement: ‘CLARIN has a strong representation of female scholars’. In fact, I was chairperson of the Scientific Board (SB) at that time, which did not only involve the operational work, but also included an influential role in strategic questions. The grant agreement noted that I was appointed ‘due to the relevance of these functions and the need to have a very well known person’, adding that ‘it is the role of the SB chairperson to take care that the careers of women are not disregarded or disadvantaged’.

I hope I did well in this respect. Certainly CLARIN did well!

A nice early memory is a dinner after a meeting in Pisa on 4 September 2008 at ‘il Connubio’, where jokingly we celebrated the marriage between Steven Krauwer (representing CLARIN) and myself (chair of the FLARENET Network). I still have the photo of the menu with the signatures of all the wedding witnesses (see left).

 

CLARIN is now more than 10 years old, and an established RI in SSH. Does this meet the expectations you had in 2012?

Even though CLARIN was officially established on February 29 2012, I must go back much before 2012, when it all started.

The time before CLARIN was officially started was very interesting. We had so many brainstorming ‘strategic’ meetings to set it up. To write these few notes on the past, I looked at old notes on the establishment of CLARIN, and between 2005 and 2009 I found records of at least 22 meetings! The first, to my knowledge, was in November 2005.

A new paradigm of R&D was emerging, pushing towards the creation of open and distributed language infrastructures for language resources and language technology, based on sharing LRs and tools. We were creating this new paradigm!

At that time, many infrastructural and networking initiatives/projects were, in different ways, aiming at creating an infrastructure of LRs: EAGLES/ISLE, CLARIN, FLaReNet (Fostering Language Resources Network), META-NET, to mention a few. All the activity around them was really exciting. These initiatives have all influenced how the future of our field would be shaped.

In a sense, the current CLARIN doesn’t meet my expectations at the time, but in a very positive way! I was afraid that the need of involving all the states would have complicated its life (probably influenced by the knowledge that in Italy it would be complicated), but it was/is a big success. Not easy to set up, but a success.

So I confess that CLARIN’s success has gone much beyond my expectations. I think this means that the time was right. But it also means that, on the one hand, we started something that was needed and, on the other, that it was led by very good people in a very effective and intelligent way.

Very motivated colleagues, starting from Peter!

 

Is there anything that has surprised you in CLARIN’s development? 

I was surprised, again in a very positive way, that CLARIN has continued ever since, and still goes on, in a very smooth and easy way. This despite the difficulty of so many partners, and in particular with so many countries involved, with different policies and needs.

I think it is due to a number of different factors, among which I can mention:

-   The people involved and their intelligence and dedication

-   The careful planning (many years before) and the very well-organised development

-   The real need for this type of action

-   The capacity of the oldest to involve, at the right time, a new generation of dedicated researchers.

And finally, and probably more important, this is due to having, since the beginning, a long-term and a broad-ranging perspective: CLARIN was conceived to be a way of making all European written and spoken heritage accessible for research by removing obstacles and facilitating the proper use of it. CLARIN was also specifically multilingual in collecting materials in all official and minority languages in Europe and from all historical periods.

  

How do you see CLARIN 10 years from now? 

I only hope that also for the next 10 years CLARIN will succeed in surprising me … going again beyond my expectations!

A completely new paradigm of R&D has emerged in the last few years, with the advent and the new trend of LLMs and generative AI. This will mean a big change in our field. The new technologies may have an impact also on an infrastructure like CLARIN, and there may be the need to modify the original vision. This may be a challenge for CLARIN, but could also be a great opportunity.

It has always been my position that we should not just follow the main scientific trends, but influence – through our research, our competence, and the strength of our visions – new developments in the field, and thus establish new trends and set new research paradigms. This is a winning strategy on the ‘research market’.

I don’t know what CLARIN will be like 10 years from now (and it’s not my task to say this, as an older-generation person in the field). But a group of young colleagues may have to start reconsidering and possibly reshaping (at least partially) the LRs of the future, consistent with the vision of the new paradigm, and consequently possibly reshaping the CLARIN mission, looking at how to position CLARIN in the new era. I think it would be important to have a small working group to start considering whether the new research paradigm will require this type of work.

A final consideration is due, looking into the future. We must remember that technical/scientific issues are obviously important, but organisational, coordination, political, infrastructural issues play a major role in our field, as in every other field. Technologies exist and develop fast, as we see now more than ever, but the infrastructure that puts them together and sustains them for the research communities must also be rethought and probably reshaped to adapt to the new technological situation and, therefore, new needs. I think this is true even more now to prepare and be ready for the future.

And I wish to finish these notes with my wishes for a wonderful and brilliant future for CLARIN in the new era!