Several groups of French scientists have undertaken a systematic, random cDNA sequencing project on the model plant species Arabidopsis thaliana. A consortium of American colleagues is using a similar strategy, and the combined efforts of the two consortia have made available more than 8,000 ESTs (expressed sequence tags), which together probably correspond to approximately 3,000 unique protein-coding genes out of the 20,000 thought to be expressed during the lifetime of the plant. Several cDNA libraries have been constructed from etiolated seedlings, cell suspensions, young shoots, immature siliques, flower buds, wounded leaves and dry seeds. Clones were randomly selected and partially sequenced from their 3' and 5' ends. The sequences were then sent to a database in Toulouse, where their redundancy was evaluated. Each group attempted to identify its sequences by homology with sequences already stored in databases. More than 7,000 sequences have been produced, and approximately 4,500 ESTs had been deposited in the EMBL database by the end of November 1994. Over the last year, 700 new unique protein-coding genes have been tagged at both their 5' and 3' ends. The next steps are to map some of these genes on recombinant inbred lines and YACs and to compare them with rice ESTs. cDNA sequencing already has applications in various fields of plant biology, such as identifying members of multigene families and allowing more in-depth analysis of metabolic pathways. The availability of both Arabidopsis and rice ESTs now allows accurate comparisons and provides a basis for mapping homologous genes in any crop. A final application is the genomic sequencing program, in which we also participated: in a 30 kbp contig we identified at least 6 genes and found 4 very good matches with ESTs.