<?xml version='1.0' encoding='iso-8859-2' ?>
<?xml-stylesheet type='text/css' href='/html/lista.css' ?>
<DKA>
<identifier> 
	<URLOfDoc>https://dka.oszk.hu/078300/078357</URLOfDoc> 
	<Filename>csabai_istvan_adatintenziv.jpg</Filename>
        <Thumbnail>https://dka.oszk.hu/078300/078357/csabai_istvan_adatintenziv_kiskep.jpg</Thumbnail> 
</identifier>
<DKAtitle> 
	<MainTitle>Adatintenzív megközelítés a tudományokban</MainTitle>
	<UniformTitle>Adatintenzív megközelítés a tudományokban</UniformTitle>
</DKAtitle>
<creator> 
	<RoleOfCreator>creator</RoleOfCreator> 
	<CreatorFamilyName>Csabai</CreatorFamilyName> 
        <CreatorGivenName>István</CreatorGivenName> 
        <CreatorInvert>N</CreatorInvert>
</creator>
<date>
        <Pevent>added</Pevent>
        <PdateChar>2021-05-14</PdateChar>
        <Pdate>2021-05-14</Pdate>
</date>
<date>
        <Pevent>available</Pevent>
        <PdateChar>2021-04-07</PdateChar>
        <Pdate>2021-04-07</Pdate>
        <PdateNote>Date of the presentation.</PdateNote>
</date>
<type>
        <NameOfType>presentation</NameOfType>
        <NameOfType>lecture</NameOfType>
</type>
<subcollection>
        <NameOfCollection>Presentation</NameOfCollection>
</subcollection>
<subcollection>
        <NameOfCollection>Library science - presentation</NameOfCollection>
</subcollection>
<subcollection>
        <NameOfCollection>Networkshop 2021</NameOfCollection>
</subcollection>
<source>
        <NameOfSource>Videotorium</NameOfSource>
        <URLOfSource>https://kifu.videotorium.hu/hu/recordings/42210</URLOfSource>
</source>
<rights>
        <OwnerOfRights>Csabai István</OwnerOfRights>
        <NoteOfRights>Copyrighted</NoteOfRights>
</rights>
<topic>
        <Topic>Computing, networks</Topic>
        <Subtopic>Internet technology</Subtopic>
</topic>
<topic>
        <Topic>Computing, networks</Topic>
        <Subtopic>Internet use</Subtopic>
</topic>
<subject>
        <Keyword>science</Keyword>
        <SubjectRefinement>subject heading/keyword</SubjectRefinement>
</subject>
<subject>
        <Keyword>data</Keyword>
        <SubjectRefinement>subject heading/keyword</SubjectRefinement>
</subject>
<subject>
        <Keyword>data processing</Keyword>
        <SubjectRefinement>subject heading/keyword</SubjectRefinement>
</subject>
<subject>
        <Keyword>information flow</Keyword>
        <SubjectRefinement>subject heading/keyword</SubjectRefinement>
</subject>
<subject>
        <Keyword>database</Keyword>
        <SubjectRefinement>subject heading/keyword</SubjectRefinement>
</subject>
<coverage>
        <CoverageKeyword>2021</CoverageKeyword>
        <CoverageRefinement>period</CoverageRefinement>
</coverage>
<description>
        <Caption>Adatintenzív megközelítés a tudományokban</Caption>
        <OCRText>Data-intensive approach in sciences 
ISTVAN CSABAI, DEPARTMENT OF PHYSICS OF COMPLEX SYSTEMS, ELTE EÖTVÖS LORÁND UNIVERSITY, BUDAPEST 
Acknowledgement: Ministry of Innovation and Technology NRDI Office, MILAB Artificial Intelligence National Laboratory Program, FIEK_16-1-2016-0005, 2020-4.1.1.-TKP2020, NVKP_16-1-2016-0004, H2020 VEO No. 874735. 
NETWORKSHOP 2021.04.07 
History of (machine) intelligence / data science 
World 
Model 
History of (machine) intelligence / data science 
Model 
Instruments 
World 
Natural intelligence 
Homo Sapiens: Technical Specifications
CPU: 100 GN (giga-neurons)
Clock frequency: 4-32 Hz
CPU cores: 1 (male version), 2+ (female v.)
CPU speed: 0.1 Flops (floating point op. / sec)
Memory (short term): 7&#177;2 bits (Pollack, I. The information of elementary auditory displays. J. Acoust. Soc. Amer., 1952, 24, 745-749.)
Storage: 1TB-2.5PB
Power: 20 W
Camera: 576Mpix, 24Hz
Touch: Yes
Display: No
Speakers: Mono
GPS: No
WIFI: No
Bluetooth: No
2G/3G/4G/5G: No/No/No/No
Latest version update: 100 000 BC
Main features:
&#8226; Find food 
&#8226; Escape predators 
&#8226; Kill enemies 
&#8226; Find mate and reproduce 
History of (machine) intelligence / data science 
First "Data Science" 
Tabulae Rudolphinae (1627), 23 years, 
History of (machine) intelligence / data science 
Science - technology - science - technology... 
Prototype of modern "data science"
SLOAN DIGITAL SKY SURVEY:
2.5 terapixel image - 300 million galaxies - 5 optical bands
640 fibers - 1 million spectra
2.5m, 120Mp -&gt; 2.5Tp; 5 years: 10TB
New issue: BIG DATA !!!
CfA 1989: 1100 galaxies
Huge data tables
Scientific goals and researcher&#8217;s perspective 
Queries in data space: e.g. separate stars and galaxies 
petroMag_i &gt; 17.5 and (petroMag_r &gt; 15.5 or petroR50_r &gt; 2) and (petroMag_r &gt; 0 and g &gt; 0 and r &gt; 0 and i &gt; 0)
and ( (petroMag_r - extinction_r) &lt; 19.2
  and (petroMag_r - extinction_r &lt; (13.1 + (7/3) * (dered_g - dered_r) + 4 * (dered_r - dered_i) - 4 * 0.18) )
  and ( (dered_r - dered_i - (dered_g - dered_r)/4 - 0.18) &lt; 0.2)
  and ( (dered_r - dered_i - (dered_g - dered_r)/4 - 0.18) &gt; -0.2)
  and ( (petroMag_r - extinction_r + 2.5 * LOG10(2 * 3.1415 * petroR50_r * petroR50_r)) &lt; 24.2) )
or ( (petroMag_r - extinction_r &lt; 19.5)
  and ( (dered_r - dered_i - (dered_g - dered_r)/4 - 0.18) &gt; (0.45 - 4 * (dered_g - dered_r)) )
  and ( (dered_g - dered_r) &gt; (1.35 + 0.25 * (dered_r - dered_i)) ) )
and ( (petroMag_r - extinction_r + 2.5 * LOG10(2 * 3.1415 * petroR50_r * petroR50_r) ) &lt; 23.3 ) )
New skills: Indexing, databases 
&#8226; SDSS data: "read through" ~1 day
&#8226; Astronomers should learn: database programming, computer geometry, search trees, ...
&#8226; Multidimensional and spherical indexing
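Why indexing matters can be sketched in one dimension (an illustrative example, not taken from the slides; the magnitudes and the function name range_query are invented here). Once a column is sorted, a range query needs only two binary searches instead of a full read-through of the table:

```python
import bisect


def range_query(sorted_values, low, high):
    """Return the values between low and high (inclusive) from a sorted list."""
    # Two binary searches locate the slice boundaries in O(log n) steps,
    # so the rest of the table is never touched.
    start = bisect.bisect_left(sorted_values, low)
    stop = bisect.bisect_right(sorted_values, high)
    return sorted_values[start:stop]


# Hypothetical magnitudes; sorting once plays the role of building the index.
magnitudes = sorted([19.4, 15.2, 17.9, 21.0, 16.3, 18.5])
bright = range_query(magnitudes, 16.0, 18.0)
print(bright)
```

This covers a single sorted column only; multidimensional and spherical indexes need dedicated structures and are not shown here.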
Modern data science: same trends in biology, environmental sciences, social sciences, ...
Not only astronomy: genomics 
Sanger sequencing. First virus sequence, 1977: &#934;X174, 5386 nt 
Nyitray László, Pál Gábor: A biokémia és molekuláris biológia alapjai (2013) 
30 years later: NGS, nanopore 
Moore's law in genomics 
Sequencing is getting cheaper. More (public) data available. 
(HGP) 1990&#8211;2003 2020 2030? 13 years / 2.7 billion USD Few days / 
loss function optimization 
images -&gt; points in N dim space 
Loss = number of wrong categorizations (error) 
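As a rough illustration of the loss defined above (a sketch, not from the slides; the function name misclassification_loss and the toy labels are made up for this example), the number of wrong categorizations can be counted like this:

```python
def misclassification_loss(predicted, actual):
    """Return the number of items whose predicted category is wrong."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    # Each mismatch contributes 1 to the loss; matches contribute 0.
    return sum(1 for p, a in zip(predicted, actual) if p != a)


# Hypothetical toy labels, for illustration only.
predicted = ["star", "galaxy", "galaxy", "star"]
actual = ["star", "galaxy", "star", "star"]
loss = misclassification_loss(predicted, actual)
print(loss)
```

A model that categorizes every item correctly has a loss of 0 under this definition.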
Complex systems &#8211; complex models 
To understand complex systems we need complex models 
Complex models, 2M+ parameters! 
We need 
&#8226; Huge amount of data to set up, constrain, parametrize the models 
&#8226; Powerful computers and clever algorithms 
Complex function regression: machine learning! 
AI Research, Education and Applications @ Eötvös University 
Dept. of Physics of Complex Systems 
&#8226; Genetics -&gt; antibiotics resistance 
Matamoros et al., Pataki et al. 2020. 
&#8226; Mobile sensors -&gt; Parkinson 
Pataki @DREAM, Laki et al. 2016 
&#8226; Mosquito images -&gt; vector borne diseases 
Pataki et al. Sci.Rep. 2021 
&#8226; Medical imaging -&gt; breast cancer 
Ribli et al. @DREAM, Sci. Rep. 2018 
&#8226; Weak lensing map -&gt; cosmology parameters 
Ribli et al. Nature Astro. 2018, MNRAS 2019 
&#8226; Explainable AI 
Ribli et al. in prep. 
&#8226; Control of aging related methylation networks 
Palla et al. subm. 
&#8226; Pathology images 
SOTE TKP collab. 
&#8226; Quantum ML 
&#8226; MSc, PhD courses 
Vector borne diseases: MosquitoAlert image deep learning 
"Zika, dengue, chikungunya, and yellow fever are all transmitted to humans by Ae. aegypti and Ae. Albopictus." 
F. Bartumeus et al. http://www.mosquitoalert.com/ 
False(?) negatives: 
False(?) positives: 
Pataki et al. Sci. Rep. 2021. 
Space weather: whistler detection 
Language of the genome 
Pollen monitoring 
Animal health 
Deep learning for colorectal cancer pathology 
Mammography with deep learning (Faster R-CNN) 
&#8226; Digital Mammography DREAM challenge 
&#8226; 1200 participants 
&#8226; Dezső Ribli, best final result 
&#8226; the only solution with localization 
&#8226; AUC = 0.95 
&#8226; Publication: Nature Scientific Reports (2018) 
&#8226; 30th most popular of 17000 articles 
&#8226; New collaborations with hospitals, clinics 
&#8226; more training data 
&#8226; open source plugin 
&#8226; steps towards licensing 
D. Ribli, A. Horváth, Z. Unger, P. Pollner, and I. Csabai. "Detecting and classifying lesions in mammograms with deep learning." Scientific Reports (2018). 
Explainable AI: automatic classification enhancement 
Any sufficiently advanced technology is indistinguishable from magic. /Arthur C. Clarke/ 
Indeed, understanding the laws of mechanics enabled us to build pyramids and cathedrals. Based on the laws of thermodynamics, the invention of the steam engine empowered us to cross oceans and continents, and today we all have "seven-league boots" in our garages. Understanding electrodynamics and quantum mechanics brought us the transistor, which is at the heart of the Internet and of the modern "magic mirrors", the mobile phones. With the advancement of high-throughput techniques we may be ready to tackle another frontier, left for last because it is the most sophisticated and complex: life and intelligence. End of diseases, much longer healthy life,...? 
What miracles will the advancements of machine learning bring? And what kind of challenges? 
NEW PARADIGMS NEED NEW RESEARCHERS 
EDUCATION: We need new scientists who have professional skills both in their 
István Csabai 
ELTE Dept. of Physics of Complex Systems csabai@elte.hu http://complex.elte.hu/~csabai/</OCRText>
        <LanguageOfDocument>English</LanguageOfDocument>
</description>
<relation>
        <NameOfRelation>Dengel Eszter: Az interfész és az információ tárolása</NameOfRelation>
        <URLOfRelation>https://dka.oszk.hu/060900/060989</URLOfRelation>
</relation>
<format>
        <FormatName>PowerPoint presentation</FormatName>
        <PageNumber>38</PageNumber>
        <NoteOfTechnology>Microsoft Office PowerPoint 2016</NoteOfTechnology>
        <Metadata>N</Metadata>
</format>
<format>
        <FormatName>PDF document</FormatName>
        <PageNumber>38</PageNumber>
        <Metadata>N</Metadata>
</format>
<format>
        <FormatName>HTML document</FormatName>
        <NoteOfTechnology>HTML version 5</NoteOfTechnology>
        <Metadata>N</Metadata>
</format>
<quality>
        <FinestFormat>JPEG image file</FinestFormat>
        <MaxImageSize>770x433</MaxImageSize>
        <FinestResolution>72</FinestResolution>
        <ColorOfImage>color</ColorOfImage>
        <CompressionQuality>moderately compressed</CompressionQuality>
</quality>
<note>
        <GeneralNote>Networkshop conference 2021</GeneralNote>
</note>
<status>
        <StatusOfRecord>COMPLETE</StatusOfRecord>
</status>
<operator>
        <RoleOfOperator>cataloging</RoleOfOperator>
        <NameOfOperator>Nagy Zsuzsanna</NameOfOperator>
</operator>
</DKA>