GRID, Cloud Technologies, Big Data — V.V. Korenkov, Director of the Laboratory of Information Technologies (LIT), JINR; Head of the Department of Distributed Information and Computing Systems, Dubna University

Grids, clouds, supercomputers, etc. (slide by Ian Bird)
Grids:
- Collaborative environment
- Distributed resources (political/sociological)
- Commodity hardware (also supercomputers)
- (HEP) data management
- Complex interfaces (bug, not feature)
Supercomputers:
- Expensive
- Low-latency interconnects
- Applications peer reviewed
- Parallel/coupled applications
- Traditional interfaces (login)
- Also SC grids (DEISA, TeraGrid)
Clouds:
- Proprietary (implementation)
- Economies of scale in management
- Commodity hardware
- Virtualisation for service provision and for encapsulating the application environment
- Details of physical resources hidden
- Simple interfaces (too simple?)
Volunteer computing:
- Simple mechanism to access millions of CPUs
- Difficult if (much) data is involved
- Control of environment
- Community building — people involved in science
- Potential for huge amounts of real work
Many different problems, amenable to different solutions — there is no single right answer.

The "Cloud Computing" Concept: Everything as a Service (XaaS)
- AaaS: Applications as a Service
- PaaS: Platform as a Service
- SaaS: Software as a Service
- DaaS: Data as a Service
- IaaS: Infrastructure as a Service
- HaaS: Hardware as a Service
This realizes the long-standing dream of computing as an ordinary utility: scalability, and payment for actual use (pay-as-you-go).
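The pay-as-you-go idea can be made concrete with a toy metered-billing sketch; all rates and item names below are invented for illustration, not taken from any real provider's price list:

```python
# Toy pay-as-you-go cost model: the user pays only for metered usage.
# The rates below are invented for illustration.
RATES = {"cpu_hour": 0.05, "gb_stored_month": 0.02, "gb_transferred": 0.01}

def monthly_cost(usage: dict[str, float]) -> float:
    """Sum metered usage multiplied by the per-unit rate."""
    return sum(RATES[item] * amount for item, amount in usage.items())

# A small workload: 1000 CPU-hours, 500 GB stored, 200 GB transferred.
cost = monthly_cost({"cpu_hour": 1000, "gb_stored_month": 500, "gb_transferred": 200})
print(round(cost, 2))  # 62.0  (50 + 10 + 2)
```

The point of the sketch is that the bill scales smoothly with use: there is no fixed capacity the user must buy up front.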

What is cloud computing?
- Centralization of IT resources
- Virtualization of IT resources
- Dynamic management of IT resources
- Automation of IT processes
- Ubiquitous access to resources
- Simplification of IT services
- Standardization of the IT infrastructure

Real-World Problems Taking Us Beyond Petascale
[Chart: projected Top500 SUM performance (100 MFlops to 1 ZFlops) against application demands from 1 PFlops to 1 ZFlops for aerodynamic analysis, laser optics, molecular dynamics in biology, aerodynamic design, computational cosmology, turbulence in physics, and computational chemistry. Source: Dr. Steve Chen, "The Growing HPC Momentum in China", June 30th, 2006, Dresden, Germany.]
Example real-world challenges: full modeling of an aircraft in all conditions; green airplanes; genetically tailored medicine; understanding the origin of the universe; synthetic fuels everywhere; accurate extreme weather prediction.

Reach Exascale by 2018: from GigaFlops to ExaFlops — sustained GigaFlop (~1987), sustained TeraFlop, sustained PetaFlop, sustained ExaFlop (~2018). The pursuit of each milestone has led to important breakthroughs in science and engineering. Source: IDC, "In Pursuit of Petascale Computing: Initiatives Around the World", 2007. Note: numbers are based on the Linpack benchmark; dates are approximate.
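Each named milestone is a factor of 1000 above the previous one, so the full span from sustained GigaFlop to sustained ExaFlop is a billion-fold increase — a quick sketch of that arithmetic:

```python
# Sustained-performance milestones from the slide; each step is a factor of 1000.
PREFIXES = {"GigaFlop": 1e9, "TeraFlop": 1e12, "PetaFlop": 1e15, "ExaFlop": 1e18}

def speedup(frm: str, to: str) -> float:
    """Factor separating two sustained-performance milestones."""
    return PREFIXES[to] / PREFIXES[frm]

print(speedup("GigaFlop", "ExaFlop"))  # 1e9: a billion-fold increase over ~30 years
```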

Top 500 (site, country — system — vendor):
1. National University of Defense Technology, China — Tianhe-2 (MilkyWay-2): TH-IVB-FEP Cluster, Intel Xeon E C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P — NUDT
2. DOE/SC/Oak Ridge National Laboratory, United States — Titan: Cray XK7, Opteron C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x — Cray Inc.
3. DOE/NNSA/LLNL, United States — Sequoia: BlueGene/Q, Power BQC 16C 1.60 GHz, Custom — IBM
4. RIKEN Advanced Institute for Computational Science (AICS), Japan — K computer: SPARC64 VIIIfx 2.0GHz, Tofu interconnect — Fujitsu
5. DOE/SC/Argonne National Laboratory, United States — Mira: BlueGene/Q, Power BQC 16C 1.60GHz, Custom — IBM
6. Texas Advanced Computing Center/Univ. of Texas, United States — Stampede: PowerEdge C8220, Xeon E C 2.700GHz, Infiniband FDR, Intel Xeon Phi SE10P — Dell
7. Forschungszentrum Juelich (FZJ), Germany — JUQUEEN: BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect — IBM
8. DOE/NNSA/LLNL, United States — Vulcan: BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect — IBM
9. Leibniz Rechenzentrum, Germany — SuperMUC: iDataPlex DX360M4, Xeon E C 2.70GHz, Infiniband FDR — IBM
10. National Supercomputing Center in Tianjin, China — Tianhe-1A: NUDT YH MPP, Xeon X5670 6C 2.93 GHz, NVIDIA 2050 — NUDT

The Grid Concept
"A grid is a system that: coordinates the use of resources in the absence of centralized control over those resources; uses standard, open, general-purpose protocols and interfaces; and delivers a high quality of service." (Ian Foster, "What is the Grid?", 2002)
Grid models: Distributed Computing; High-Throughput Computing; On-Demand Computing; Data-Intensive Computing; Collaborative Computing.
The interdisciplinary character of the grid: the technologies being developed are applied in high-energy physics, space physics, microbiology, ecology, meteorology, and various engineering and business applications.
Virtual Organizations (VO).

GRID MIDDLEWARE
[Diagram: grid middleware ties together visualization, workstations, mobile access, supercomputers and PC clusters, the Internet and networks, and mass storage, sensors, and experiments.]
The grid is a means for the shared use of computing power and data storage via the Internet.

Entering a New Era in Fundamental Science
The Large Hadron Collider (LHC), one of the largest and truly global scientific projects ever built, is the most exciting turning point in particle physics: the exploration of a new energy frontier, with proton-proton and heavy-ion collisions at E_CM up to 14 TeV. LHC ring: 27 km circumference. Experiments: TOTEM, LHCf, MOEDAL, CMS, ALICE, LHCb, ATLAS.

Collision of proton beams… …observed in giant detectors Large Hadron Collider

[Figure: detector data rates, GB/sec (ions).]

Tier 0 – Tier 1 – Tier 2
- Tier-0 (CERN): data recording; initial data reconstruction; data distribution
- Tier-1 (11 centres): permanent storage; re-processing; analysis
- Tier-2 (>200 centres): simulation; end-user analysis
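The division of labour between the tiers can be sketched as a small lookup model; this is only a toy illustration of the role assignments listed above, not any real WLCG interface:

```python
# Toy model of the WLCG tier roles listed on the slide.
TIER_ROLES = {
    "Tier-0": ["data recording", "initial data reconstruction", "data distribution"],
    "Tier-1": ["permanent storage", "re-processing", "analysis"],
    "Tier-2": ["simulation", "end-user analysis"],
}

def tiers_for(activity: str) -> list[str]:
    """Which tier(s) handle a given activity in this simplified model."""
    return [tier for tier, roles in TIER_ROLES.items() if activity in roles]

print(tiers_for("simulation"))  # ['Tier-2']
```

The key design point the model captures is that responsibilities are partitioned: raw-data handling stays at CERN, custodial storage and reprocessing are replicated across 11 Tier-1 centres, and the bulk of simulation and user analysis is spread over more than 200 Tier-2 sites.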

The Worldwide LHC Computing Grid (WLCG)

GRID, as a brand-new understanding of the possibilities of computers and computer networks, foresees a global integration of information and computing resources. At the initiative of CERN, the EU-DataGrid project started in January 2001 with the purpose of testing and developing advanced grid technologies; JINR was involved in this project. The LCG (LHC Computing Grid) project was a continuation of EU-DataGrid; its main task was to build a global infrastructure of regional centres for the processing, storage and analysis of data from physics experiments at the Large Hadron Collider (LHC). The Russian Consortium RDIG (Russian Data Intensive Grid) was established to provide full-scale participation of JINR and Russia in the implementation of the LCG/EGEE project. The EGEE (Enabling Grids for E-sciencE) project was then started, with CERN as its head organization and JINR as one of its executors, followed by the EGI-InSPIRE project (Integrated Sustainable Pan-European Infrastructure for Researchers in Europe).

Normalized CPU time (2013): all countries — 12,797,893,444; Russia — 345,722,044. Jobs: all countries — 446,829,340; Russia — 12,836,045.
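From the 2013 totals above, Russia's share works out to roughly 2.7% of normalized CPU time and 2.9% of jobs:

```python
# Russia's share of the 2013 WLCG totals quoted on the slide.
cpu_all, cpu_ru = 12_797_893_444, 345_722_044
jobs_all, jobs_ru = 446_829_340, 12_836_045

cpu_share = cpu_ru / cpu_all    # ≈ 0.027, i.e. about 2.7%
job_share = jobs_ru / jobs_all  # ≈ 0.029, i.e. about 2.9%
print(f"{cpu_share:.1%} of CPU time, {job_share:.1%} of jobs")
```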

JINR Central Information and Computing Complex (CICC) — JINR-LCG2 Tier2 site
- CICC comprises 2582 cores; disk storage capacity 1800 TB; availability and reliability 99%
- JINR provides 40% of the RDIG contribution to the LHC
- ~18 million tasks were executed in RDIG
- Normalized CPU time (HEPSPEC06) per site
- Foreseen computing resources to be allocated for the JINR CICC in 2014: CPU (HEPSPEC06), disk storage (TB), mass storage (TB)

Multifunctional centre for data processing, analysis and storage:
- a Tier2-level grid infrastructure to support the experiments at the LHC (ATLAS, ALICE, CMS, LHCb) and FAIR (CBM, PANDA), as well as other large-scale experiments;
- a distributed infrastructure for the storage, processing and analysis of experimental data from the NICA accelerator complex;
- a cloud computing infrastructure;
- a hybrid-architecture supercomputer;
- an educational and research infrastructure for distributed and parallel computing.

Collaboration in the area of WLCG monitoring
The Worldwide LHC Computing Grid (WLCG) today includes more than 170 computing centres, where more than 2 million jobs are executed daily and petabytes of data are transferred between sites. Monitoring of the LHC computing activities and of the health and performance of the distributed sites and services is a vital condition for the success of LHC data processing. For several years CERN (IT Department) and JINR have collaborated on the development of applications for WLCG monitoring:
- WLCG Transfer Dashboard
- Monitoring of the XRootD federations
- WLCG Google Earth Dashboard
- Tier3 monitoring toolkit

JINR distributed cloud grid infrastructure for training and research
There is a demand for a special infrastructure that could become a platform for training, research, development, testing and evaluation of modern technologies in distributed computing and data management. Such an infrastructure was set up at LIT by integrating the JINR cloud and the educational grid infrastructure of sites located at the following organizations: Institute for High Energy Physics (Protvino, Moscow region); Bogolyubov Institute for Theoretical Physics (Kiev, Ukraine); National Technical University of Ukraine "Kyiv Polytechnic Institute" (Kiev, Ukraine); L.N. Gumilyov Eurasian National University (Astana, Kazakhstan); B. Verkin Institute for Low Temperature Physics and Engineering of the National Academy of Sciences of Ukraine (Kharkov, Ukraine); Institute of Physics of the Azerbaijan National Academy of Sciences (Baku, Azerbaijan).
[Slide shows the main components of modern distributed computing and data management technologies and a scheme of the distributed cloud grid infrastructure.]

Big Data Has Arrived at an Almost Unimaginable Scale (NEC, 9/12/13)
- Business emails sent per year: 3000 PBytes
- Content uploaded to Facebook each year: 182 PBytes
- Google search index: 98 PBytes
- Health records: 30 PBytes
- YouTube: 15 PBytes
- LHC annual data: 15 PBytes
- ATLAS annual data volume: 30 PBytes
- ATLAS managed data volume: 130 PBytes
(Also shown for comparison: climate data, the Library of Congress, Nasdaq, and the US census.)

Evolving PanDA for Advanced Scientific Computing
The proposal titled "Next Generation Workload Management and Analysis System for Big Data" (Big PanDA) started in September 2012 (DoE-funded). It generalizes PanDA as a meta-application providing location transparency of processing and data management for HEP, other data-intensive sciences, and a wider exascale community. There are three dimensions to the evolution of PanDA:
- making PanDA available beyond ATLAS and high-energy physics;
- extending beyond the grid (leadership computing facilities, high-performance computing, Google Compute Engine (GCE), clouds, Amazon Elastic Compute Cloud (EC2), university clusters);
- integration of the network as a resource in workload management.

Leadership Computing Facilities: Titan (slide from Ken Read, Big Data Workshop, 8/6/13)

Tier1 centre
The Federal Target Programme project "Creation of an automated system of data processing for experiments at the LHC of Tier-1 level and maintenance of grid services for a distributed analysis of these data"; duration: 2011–2013.
March: proposal to create an LCG Tier1 centre in Russia (an official letter by the Minister of Science and Education of Russia A. Fursenko was sent to CERN DG R. Heuer): NRC KI for ALICE, ATLAS and LHCb; LIT JINR (Dubna) for the CMS experiment. Full resources in 2014, to meet the start of the next working LHC session.
September 2012: the proposal was reviewed by the WLCG OB, and the JINR and NRC KI Tier1 sites were accepted as new Associate Tier1s.

JINR Tier1 Connectivity Scheme

JINR CMS Tier-1 progress
- Engineering infrastructure (uninterruptible power supply and climate control);
- High-speed, reliable network infrastructure with a dedicated reserved channel to CERN (LHCOPN);
- Computing system and storage system based on disk arrays and high-capacity tape libraries;
- 100% reliability and availability.
2012 (done): CPU (HEPSpec06), number of cores, disk (TB), tape (TB).

WLCG Tier1 sites: Lyon/CCIN2P3, Barcelona/PIC, De-FZK, US-FNAL, Ca-TRIUMF, NDGF, CERN, US-BNL, UK-RAL, Taipei/ASGC, Amsterdam/NIKHEF-SARA, Bologna/CNAF; Russia: NRC KI, JINR.

Frames for grid cooperation of JINR
- Worldwide LHC Computing Grid (WLCG)
- EGI-InSPIRE
- Enabling Grids for E-sciencE (EGEE), now EGI-InSPIRE
- RDIG development project
- BNL, ANL, UTA: Next Generation Workload Management and Analysis System for Big Data
- Tier1 centre in Russia (NRC KI, LIT JINR)
- 6 projects at CERN
- BMBF grant "Development of the grid infrastructure and tools to provide joint investigations performed with participation of JINR and German research centers"
- Development of a grid segment for the LHC experiments, supported in the frame of the JINR–South Africa cooperation agreement
- Development of a grid segment at Cairo University and its integration into the JINR GridEdu infrastructure
- JINR–FZU AS Czech Republic project "The grid for the physics experiments"
- NASU-RFBR project "Development and support of LIT JINR and NSC KIPT grid infrastructures for distributed CMS data processing of the LHC operation"
- JINR–Romania cooperation: Hulubei–Meshcheryakov programme
- JINR–Moldova cooperation (MD-GRID, RENAM)
- JINR–Mongolia cooperation (Mongol-Grid)
- GridNNN project (National Nanotechnological Network)

Summary
Training of highly qualified IT personnel:
- a modern infrastructure (supercomputers, grid, cloud) for education, training, and project work;
- centres of advanced IT technologies, on the basis of which scientific schools are formed and high-level projects are carried out in broad international cooperation involving students and postgraduates;
- participation in megaprojects (LHC, FAIR, NICA, PIC), within which new IT technologies (WWW, Grid, Big Data) are created;
- international student schools and summer internships at high-technology organizations (CERN, JINR, NRC "Kurchatov Institute").