Less data and more brain

In a diagram published in 1970 showing the distribution of pulsars in the Galaxy, there are about fifty points. In the same year, an article on the optical identification of X-ray sources discusses a handful of them: all those known at the time. If we compare the state of astronomy forty years ago with the current situation, we are struck above all by the tremendous amount of data that we have been accumulating at an increasingly frantic pace. Surveys of large areas of the sky, if not the whole sky, conducted in the main electromagnetic bands with ever greater angular resolution and sensitivity, have produced an enormous amount of data which is far from being fully analyzed and used. Projects just started or under way continue to churn out the numbers of the Universe, byte after byte, pixel by pixel, and will continue to do so in the future. The concern is that the effort to acquire new data exceeds the effort devoted to extracting from those data all the information they contain. I developed this concern some years ago, following a medical examination, which led me to reflect on how the "we need more data" approach had become the preferred one when trying to resolve our doubts and increase our knowledge.

"We need more data": a trend of our times

Years ago, a persistent shoulder pain convinced me to go to the doctor for a checkup. After a short wait I found myself describing my symptoms to a very polite person who asked me many things, but never asked me to undress and show her the painful part.

If a nail had been stuck in my shoulder she would not have noticed it. She did not look at it, she did not touch it, she did not ask me to perform any particular movements. She did not deem it necessary, or perhaps even useful, to know if it was red or blue, swollen or not. She simply prescribed an X-ray: more data.

I came back about a week later with the - thankfully negative - X-ray results and the scene repeated itself, similar but quicker, and ended with the prescription of another specialist examination: an ultrasound scan. More data again.

The diagnosis was made by the technician who performed the ultrasound: bursitis. This nice person, who clearly had long experience in the field, also suggested the therapy: "Just wait until it goes away." I waited, and after a few months the pain was gone.

I have the impression that this approach, wanting the results of extensive specialized tests before providing a diagnosis, is taking root, at least judging by the number of tests prescribed at the first sign of an ailment, however mild and generic it may be. This often comes at the expense of reflection and of the application of Bayes' theorem, which physicians of yore used to apply - probably unknowingly - when they made their diagnoses based on elements immediately and directly available.

But wanting to use new data without having extracted everything possible from those already available is not only a medical problem.
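To make the reference to Bayes' theorem concrete, here is a minimal sketch of how a diagnosis can be updated step by step from evidence already at hand. All the numbers (the prior, and the sensitivity and specificity of a physical examination and of a scan) are hypothetical values chosen only for illustration; they are not taken from the article or from any clinical source.

```python
def posterior(prior, sensitivity, specificity):
    """Probability of the condition given a positive finding (Bayes' theorem).

    prior       -- probability of the condition before the finding
    sensitivity -- P(positive finding | condition present)
    specificity -- P(negative finding | condition absent)
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical figures, for illustration only.
prior = 0.30  # assumed share of shoulder-pain patients with the condition

# Step 1: evidence available immediately (looking, touching, asking).
after_exam = posterior(prior, sensitivity=0.80, specificity=0.90)

# Step 2: a further test, using the previous posterior as the new prior.
after_scan = posterior(after_exam, sensitivity=0.90, specificity=0.90)

print(f"prior:             {prior:.2f}")
print(f"after examination: {after_exam:.2f}")
print(f"after scan:        {after_scan:.2f}")
```

With these assumed values, most of the gain in confidence comes from the first step, the evidence that costs nothing to collect.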

Improved use of data already available

Thanks to increasingly sophisticated equipment and the growing number of active telescopes, the amount of information available to the experts doubles every twelve months. Moreover, while in the past the data acquired were almost always the exclusive property of the group that had obtained them, in recent years it is increasingly common for data to be placed in an archive accessible to all interested researchers (usually after a year or so, a period during which exclusive use by the owners is guaranteed).

Nevertheless, the hunger for new data remains very high: for example, requests for telescope observations exceed (by far) the time available. So much so that individual observation requests are made by groups and for programs that require many hundreds of hours of observation. Not to mention the projects to build new instruments to be used on the ground or in space. My impression is that more and more frequently the acquisition of new data is becoming a shortcut (though not in terms of cost or time) to get closer to solving a problem. In my opinion, this is an illusory alternative to the harder work of analyzing known data, which carries with it the responsibility of producing a "diagnosis". The demand for new data risks becoming an alternative to thinking, to squeezing every drop of information from data already available. My idea is confirmed by the growing number of archival research projects being proposed: this research uses precisely "old" data to address scientific issues other than those for which these data were originally - and by others - acquired.

Virtual astronomical observatories

To leverage these opportunities and facilitate access to the huge amount of data available in the various research institutions, virtual astronomical observatories have been developed worldwide (see for example and also

The essential consideration is that the amount of data doubles every year, while the computing power and network speed double, respectively, "only" every 18 to 20 months. Based on the Grid computing approach, the design of a virtual observatory aims to leave the data where they are (in the archives of ESO, ESA, NASA, etc.) and to distribute the data processing, so that only the results of the analysis need to be transferred. This approach aims to develop tools that allow real interoperability among the various archives and that can process huge amounts of data in a short time. In this way we can continue expanding our knowledge by using observations already available to the community.
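The consequence of these different doubling times can be worked out with a few lines of arithmetic. The sketch below assumes pure exponential growth and a common starting point (both quantities normalized to 1 at year zero), which is a simplification; the function names are illustrative, and 18 months is taken as the default for computing power, with 20 months as the alternative mentioned above.

```python
def growth_factor(years, doubling_months):
    """Multiplicative growth after `years`, given a doubling time in months."""
    return 2.0 ** (12.0 * years / doubling_months)


def data_to_compute_ratio(years, data_doubling=12.0, compute_doubling=18.0):
    """How much faster the data volume has grown than the capacity to handle it."""
    return growth_factor(years, data_doubling) / growth_factor(years, compute_doubling)


for years in (1, 3, 5, 10):
    ratio_18 = data_to_compute_ratio(years, compute_doubling=18.0)
    ratio_20 = data_to_compute_ratio(years, compute_doubling=20.0)
    print(f"after {years:2d} years: data/capacity = "
          f"{ratio_18:6.1f} (18 months), {ratio_20:6.1f} (20 months)")
```

Under these assumptions the gap itself doubles every three years in the 18-month case, which is why moving the analysis to the data, rather than the data to the analysis, becomes more attractive over time.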

I am going to close with a challenge. If the crisis and the general contraction of available resources force us to "close" telescopes and laboratories, postpone the construction of large facilities or even consider cancelling projects under way, there is no need to tear our hair out or change jobs. Rather, we can use what we already have. I bet that astronomy would still be striding forward.

Source: "Le Stelle" - n°101, December 2011
