September 22nd, 2016

356A Fitzpatrick

3:30 pm - 5:00 pm

Emu Solutions’ Migratory Thread and Memory-Side Processing Technology

 

Ken Jacobsen

CEO Emu Technology

 

Conventional computer design has changed little over the last fifty years. Almost all commercial computers employ CPUs connected to data caches, which are then connected to large memory subsystems. As the size of problems becomes large, many copies of these CPU-memory combinations are interconnected using some form of high speed network to compose a larger system. Such systems work extremely well, provided that a majority of accesses to memory hit in caches and that the vast majority of all references are for data in local memory, and therefore do not require moving data across the interconnecting network.

Unfortunately, many Big Data problems do not meet these conditions for efficient operation. The data is too large to fit in a single memory, and, since the goal of data analytics is typically to detect and analyze relationships between data elements spread throughout the entire database, the targets of memory references are randomly distributed and the vast majority of references go across the interconnect network. Conventional, cache-based computers rely on strong data locality for performance, so applications with large graphs or very sparse matrices, where data locality cannot be assumed, will continue to be problematic for them. The Emu system instead gains performance from weak locality and has no reliance on data adjacency. Using a new paradigm based on Migratory Threads and Memory-side Processing, Emu effectively ‘brings the man to the mountain [of data]’, thereby avoiding the bandwidth and latency limitations that choke today’s HPC systems.
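To make the contrast concrete, the small Python sketch below compares the network traffic of the two approaches on a graph traversal with weak locality: a conventional core pulls a cache line across the interconnect for every remote load, while a migratory thread moves its context once to the node that owns a vertex and then reads the adjacency data locally. This is only an illustration, not Emu's programming model; the cache-line size, thread-context size, and vertex degrees are assumed values for the example.

```python
# Toy model of a graph traversal with "weak locality": each visited vertex lives on
# a random node, but its adjacency data is contiguous on that node.  Compares moving
# the data to the thread (conventional) with moving the thread to the data
# (migratory).  All sizes are illustrative assumptions, not Emu's actual parameters.
import math
import random

random.seed(0)
VISITS = 100_000        # vertices touched by one traversal thread
CACHE_LINE = 64         # bytes per remote load in the conventional model (assumed)
THREAD_CTX = 256        # bytes moved per thread migration (assumed)
ENTRY = 8               # bytes per adjacency-list entry (assumed)

conv_bytes = conv_trips = mig_bytes = mig_hops = 0
for _ in range(VISITS):
    degree = random.randint(16, 128)                    # neighbors of this vertex
    lines = math.ceil(degree * ENTRY / CACHE_LINE)      # cache lines holding them
    conv_bytes += lines * CACHE_LINE                    # pull every line across the net
    conv_trips += lines                                 # one round trip per line
    mig_bytes += THREAD_CTX                             # hop once, then read locally
    mig_hops += 1

print(f"conventional: {conv_bytes / 1e6:7.1f} MB moved, {conv_trips:,} network round trips")
print(f"migratory:    {mig_bytes / 1e6:7.1f} MB moved, {mig_hops:,} thread migrations")
```

Under these assumed parameters the migratory model moves fewer bytes and, more importantly, pays one hop per vertex rather than one round trip per cache line; when accesses are dense and cache-friendly, the conventional model keeps its advantage, as noted above.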

Emu’s revolutionary approach enables huge performance gains in extracting knowledge from diverse, unstructured data sets. Problems not solved well by today’s supercomputers are effectively addressed, including quantitative financial modeling and analytics, social media, NORA (non-obvious relationship awareness), fraud detection, optimization, and genomics, among others.

About Ken Jacobsen

Ken has more than 35 years of experience building and leading highly technical organizations. Most recently he was the CEO at GreenWave Systems Inc., a large, custom supercomputer engineering company. Previously, he led all technical efforts at the Morse Project, creating smart devices for the Internet of Things, and was the CEO of Foveal Recall Inc., which created specialized software tools for image object characterization and recognition. He has also held the positions of EVP of Informatics at Incyte Genomics and VP of Applications at SGI. Ken holds an undergraduate degree from the California Institute of Technology and a PhD from the University of California at Berkeley.

The Center for Research Computing’s Sandra Gesing Collaborates to Improve Usability of Science Gateways Across Domains

The National Science Foundation (NSF) has awarded a five-year $15 million grant to a collaborative team led by the San Diego Supercomputer Center (SDSC) at UC San Diego to establish a Science Gateways Community Institute to accelerate the development and application of highly functional, sustainable science gateways that address the needs of researchers across the full spectrum of NSF directorates.

“Gateways foster collaborations and the exchange of ideas among researchers and can democratize access, providing broad access to resources sometimes unavailable to those who are not at leading research institutions,” said Nancy Wilkins-Diehr, SDSC associate director and principal investigator for the project.

The new SGCI award brings together expertise from a wide range of partner universities and institutions, including Elizabeth City State University in North Carolina; Indiana University; the University of Notre Dame; Purdue University; the Texas Advanced Computing Center (TACC) at the University of Texas at Austin; and the University of Michigan, Ann Arbor.

Sandra Gesing, Computational Scientist at the Center for Research Computing, Research Assistant Professor of Computer Science and Engineering at the University of Notre Dame, and part of the SGCI collaborative team, will promote national and international communication between gateway developers.

"My research has focused for 10 years on science gateways - also known as virtual research environments or virtual laboratories - and I am thrilled to be part of the SGCI. We envision the institute as a central contact, support and information point for the community and for national as well as international collaborations on science gateways,” said Gesing. “With well-designed measures we aim at improving the usability and sustainability of science gateways and thus enhancing the reusability of scientific methods and reproducibility of scientific research."

The SGCI is one of two major NSF awards announced today to establish the Scientific Software Innovation Institutes (S2I2) that will help increase the usability and sustainability of computer science tools.

“The Institutes will ultimately impact thousands of researchers, making it possible to perform investigations that would otherwise be impossible and expanding the community of scientists able to perform research on the nation’s cyberinfrastructure,” said Rajiv Ramnath, program director in the division of Advanced Cyberinfrastructure at NSF.

The full NSF S2I2 award announcement is available here.

The work is funded via NSF award number ACI-1547611, and more information about SGCI is available here.

Read the full story here.

Whole Tale enables new discovery by bringing ‘life’ to research articles

Directions for a new piece of "some assembly required" furniture are only useful if the user has the parts listed in the instruction manual. That makes those coffee tables and bookcases relatively easy to put together, compared with designing and constructing your own from scratch.

Scientists at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, in collaboration with the Center for Research Computing at the University of Notre Dame; the Texas Advanced Computing Center; the Computation Institute at the University of Chicago and Argonne National Laboratory; and the National Center for Ecological Analysis and Synthesis at the University of California, Santa Barbara, are hoping to do the same thing with computer code. "Whole Tale," a new, five-year, $5 million National Science Foundation-funded Data Infrastructure Building Blocks (DIBBs) project, aims to give researchers the same instructions and ingredients to help ensure reproducibility and pave the way for new discoveries.

Whole Tale will enable researchers to examine, transform, and then seamlessly republish research data, creating "living articles" that will enable new discovery by allowing researchers to construct representations and syntheses of data.

"Whole Tale" alludes to both the "whole publication story" and the "long tail of science." The project will create methods and tools for scientists to link executable code, data, and other information directly to online scholarly publications, whether the resources used are small-scale computation or state-of-the-art high-performance computing.

"The Center for Research Computing at the University of Notre Dame is very proud to be part of this collaboration. Whole Tale will contribute to openness and reproducibility of scientific research. We strive to deliver tools that scientists can use to preserve the integrity of science", says co-PI at Notre Dame Jarek Nabrzyski, director of the Center for Research Computing and a concurrent associate professor of Computer Science and Engineering.

How will Whole Tale work? Through a web browser, a scientist will be able to seamlessly access research data and carry out analyses in the Whole Tale environment. Digital research objects, such as code, scripts, or data produced during the research, can be shared between collaborators. These will be bundled with the paper to produce a “living article,” accessible by reviewers and the scientific community for in-depth pre- and post-publication peer review. Augmenting the traditional research publication with the full computational environment will enable discovery of underlying data and code, facilitating reproducibility and reuse. Whole Tale will provide an environment of multiple, independently developed frontends (e.g., Jupyter, RStudio, or Shiny) where data can be explored in myriad ways to yield better opportunities for understanding, use, and reuse of the data.
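As a purely generic illustration of what bundling an environment with a result can involve (this is not Whole Tale's actual format or API; the file names and manifest fields below are invented for the example), a short Python sketch can capture the interpreter, the installed packages, and checksums of the input data into a manifest that travels alongside a paper's figures:

```python
# Generic illustration (not Whole Tale's actual format): record enough about the
# computational environment and the inputs that an analysis can be re-run and checked.
import hashlib
import json
import platform
import sys
from importlib import metadata
from pathlib import Path

def sha256(path: Path) -> str:
    """Checksum an input file so reviewers can confirm they have the same data."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(data_files):
    """Collect interpreter, package, and input-data details into one dictionary."""
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {d.metadata["Name"]: d.version for d in metadata.distributions()},
        "inputs": {f: sha256(Path(f)) for f in data_files if Path(f).exists()},
    }

if __name__ == "__main__":
    # "observations.csv" is a hypothetical input; use the data behind the article's figures.
    manifest = build_manifest(["observations.csv"])
    Path("environment_manifest.json").write_text(json.dumps(manifest, indent=2))
```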

For the whole story, visit: http://www.ncsa.illinois.edu/news/story/whole_tale_enables_new_discovery_by_bringing_life_to_research_articles

Follow the development of this five-year project at: http://wholetale.org/.

The May 2016 Newsletter has been released with the latest information on activities and events.

Visit the Newsletter page for back issues.

Student Engineers Reaching Out (SERO) students at the University of Notre Dame developed a life-changing device for a local woman. Check out the full story featured on 16 WNDU.

 

SERO helps Katelyn Toth regain independence

 

For more information on SERO, under the direction of adviser Professor Paul Brenner, click here.

Sandra Gesing, Research Assistant Professor of Computer Science and Engineering and Center for Research Computing faculty member at the University of Notre Dame, is bridging the gap between researchers, their data, and its analysis with her work on science gateways. Science gateways are specific software solutions that meet researchers' needs while hiding complex underlying infrastructures - from distributed computing infrastructures to distributed data and/or Big Data in clouds, grids, or local resources. Scientists working on drug design or striving to eradicate vicious diseases use science gateways to access, evaluate, analyze, and run valuable simulations with their data without having to be experts in information technology. Gesing, her work, and her views on usability and sustainability of computational applications as well as reproducibility of science are highlighted in International Innovation. Learn more about Gesing from her Q&A and in the article, "Gateways to advancing science", featuring Gesing’s impact on the scientific community and how she is advancing research with technology.

Link to article in International Innovation

Monday, May 9, 2016

1:00 p.m.-2:00 p.m.

Notre Dame Room, LaFortune Student Center

 

J.C. Desplat

Director

Irish Centre for High-End Computing (ICHEC)

High-Performance Computing in the land of W.B. Yeats: So much more than just supercomputers

The Irish Centre for High-End Computing (ICHEC) was established in 2005 as the national High-Performance Computing (HPC) centre in Ireland, a country with a thriving technology scene. At that time, the Centre’s core mission was the delivery of HPC resources to academia, including training and education activities. ICHEC has since supported over 1,400 researchers.

In 2009, ICHEC initiated a comprehensive programme of industry engagement using a two-pronged approach combining technology co-development and commercial service provision. The former has led to fruitful collaborations with Intel, Xilinx and DDN around their latest technology platforms; the latter has seen ICHEC bring its Technical Computing expertise to MNCs and SMEs in areas as diverse as oil & gas, financial services, renewable energy, medical devices and analytics.

Finally, ICHEC also extended its reach to the Public Sector, with flagship collaborations with Met Éireann (the Irish national weather forecasting agency) and the Central Statistics Office (Ireland’s national statistics institute). Following the award of a contract from the European Space Agency (ESA) through Irish technology company Skytek, ICHEC has embarked on the development of a portal for ESA Sentinel and other Earth observation data, providing access to archived products, real-time data, and on-demand processing, and developing services of notable societal and economic value.

ICHEC’s strength comes from its people, its expertise. Small countries cannot afford to compete with larger countries on hardware infrastructure, nor can they justify such investments. Instead, ICHEC has chosen to focus on the effective use of HPC methodologies and novel technologies with application to strategic areas such as energy-efficient programming for IoT, remote observation for precision agriculture and planning, high-resolution modelling for renewable energy, and of course big data. Novel technologies underpin innovation, and HPC methodologies are key to unlocking their potential.

We currently have a number of vacancies for HPC and Big Data specialists, as well as systems administrators and domain experts (incl. remote observation). We are also interested in establishing new collaborations with like-minded groups and organisations worldwide.

About J.C. Desplat

JC studied at Sheffield Hallam University in the UK where he received a PhD for his work on “Monte Carlo simulations of amphiphilic systems” in 1996. His first experience of parallel computing dates back to 1992 on T800 transputers.

He later joined the Edinburgh Parallel Computing Centre (EPCC) in 1995, where he spent nearly ten years. The highlights of his employment in Scotland include his involvement in Prof. Michael E. Cates’ research group, for which he co-developed the parallel generalised Lattice-Boltzmann code “Ludwig” that enabled a number of high-impact publications, as well as negotiating and setting up EPCC’s involvement in the DEISA (Distributed European Infrastructure for Supercomputing Applications) and HPC-Europa projects.

JC joined the Irish Centre for High-End Computing (ICHEC) in 2005 as Technical Manager (and employee #5) and later became Associate Director (2007) and Director (2012). His expertise has proven crucial in establishing ICHEC as a highly respected centre for High-Performance and Technical Computing within only a few years of its creation.

Over his time in Ireland, JC has secured grants to the total value of €23.3m (c. $26.3m) from a number of Irish Government Departments and funding agencies as well as from the European Commission’s FP7 and H2020 programmes. He also raised a further €3.4m (c.$3.9m) from commercial services.

JC has held the position of “Honorary Professor of Computational Science” at the Dublin Institute for Advanced Studies (DIAS) since 2008, and that of Adjunct Professor in the School of Physics at NUI Galway since 2012. He was a member of the Digital Humanities Observatory Management Board (2008-2012) and of the Environmental Protection Agency (EPA) Climate Change Coordination Committee (2008-2013). More recently, he was a member of the ICT Sub-Committee of the Irish Medical Council (2011-2013) and the UK Engineering and Physical Sciences Research Council (EPSRC) Research Infrastructure Strategic Advisory Team (2011-2015). JC was awarded the distinction of Chevalier de l’Ordre des Palmes Académiques by the French government in 2016.

Tuesday, April 19, 2016

3:00 p.m.

Notre Dame Room, LaFortune Student Center

 

E. Lynn Usery

U.S. Geological Survey

 

Ontology and Semantics for Topographic Information

The U.S. Geological Survey (USGS) is developing an ontology and semantic representation for topographic data and information. The ontology development has generated a taxonomy of all features on standard topographic maps, a formal machine-readable vocabulary of feature names and definitions, predicates formed from attributes and relationships of the features, and actual instance data, including geometric coordinates and topological relations, encoded as predicates in a machine-interpretable triple format. The geographic features are being developed as Resource Description Framework (RDF) triples with Uniform Resource Identifiers (URIs) and interlinked to become a part of Linked Open Data on the Semantic Web. We have developed semantics of vector- and point-based geographic data, including transportation, hydrography, boundaries, structures, and geographic names, from basic tables of attributes and relationships of geographic features in these categories of data. We have created a graphical user interface to query these data with the SPARQL Protocol and RDF Query Language and to visualize them in a cartographic rendering. The interface also supports data integration through federated queries of USGS data with other RDF and Linked Open Data available on the Semantic Web. Raster-based data, including terrain, land cover, and orthographic images, pose different requirements since no geographic features are readily defined and identified in these data. Through collaboration with ontologists and others at GeoVocamps, a set of basic ontology design patterns has been defined for terrain features. We are now beginning to use object-based image analysis and machine learning techniques, specifically neural networks, to automatically extract these features from basic data sources including lidar point clouds, elevation matrices, orthographic images, and scanned topographic maps. The extracted features are then built as RDF and become available through URIs, in the same manner as the geographic features built from the vector-based data. The ultimate goal of this work is to make all geographic features, currently shown on USGS topographic maps and in our databases of geographic information systems, available on the Semantic Web with each feature directly accessible through its URI.
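As a small, self-contained illustration of the triple-and-query workflow the abstract describes (the namespace, predicates, and feature values below are invented for the example and are not the USGS vocabulary or its instance data), a few feature triples can be built and queried with SPARQL using Python's rdflib:

```python
# Minimal sketch of encoding a topographic feature as RDF triples and querying it with
# SPARQL via rdflib.  The namespace, predicates, and values are invented for the
# example; they are not the USGS ontology.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/topo#")

g = Graph()
stream = EX["feature/12345"]
g.add((stream, RDF.type, EX.Stream))                    # taxonomy class
g.add((stream, RDFS.label, Literal("Example Creek")))   # geographic name
g.add((stream, EX.flowsInto, EX["feature/67890"]))      # relationship predicate
g.add((stream, EX.latitude, Literal(41.70)))            # instance geometry
g.add((stream, EX.longitude, Literal(-86.24)))

# SPARQL query: every stream and the feature it flows into.
results = g.query("""
    PREFIX ex: <http://example.org/topo#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?name ?target
    WHERE {
        ?s a ex:Stream ;
           rdfs:label ?name ;
           ex:flowsInto ?target .
    }
""")
for name, target in results:
    print(f"{name} flows into {target}")
```

The same query pattern extends naturally to federated queries against other Linked Open Data endpoints, which is the data-integration capability the interface described above provides.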

About E. Lynn Usery

E. Lynn Usery is a Research Physical Scientist and Director of the Center of Excellence for Geospatial Information Science (CEGIS) with the U.S. Geological Survey (USGS). He has worked as a cartographer and geographer for the USGS for more than 25 years and was a professor of geography for 17 years at the University of Wisconsin-Madison and the University of Georgia. Dr. Usery established a program of cartographic and geographic information science (GIScience) research that evolved into CEGIS. He has served as President of the University Consortium for Geographic Information Science (UCGIS) and the Cartography and Geographic Information Society (CaGIS) and is currently President of the American Society for Photogrammetry and Remote Sensing and Vice-President of the International Cartographic Association. He was editor of the journal Cartography and Geographic Information Science and is currently Associate Editor for the International Journal of Cartography. Dr. Usery is currently Chair of the Local Organizing Committee and Conference Director for the 2017 International Cartographic Conference, which will be held in Washington, D.C. He is a Fellow of CaGIS and UCGIS and received the CaGIS Distinguished Career Award in 2012. Dr. Usery has published more than 100 research articles in cartography and GIScience. He earned a BS in geography from the University of Alabama and MA and Ph.D. degrees in geography from the University of Georgia. His current interests and research are in theoretical GIScience, including geospatial ontologies and semantics, map projections, multidimensional data models for lidar, and high-performance computing for spatial data.

Wednesday, April 20, 2016

2:30 p.m.

DeBartolo Hall, Room 120

 

Thomas Connor

Senior Lecturer, Cardiff School of Biosciences

 

Overcoming Biological Silos in Bioinformatics using Cloud Computing

Genome sequencing has made it possible to examine fundamental biological questions over a huge range of scales; from bacteria to man. Since the first bacterial genome was published 20 years ago, there has been an explosion in the production of sequence data, fuelled by next-generation sequencing, placing biology at the forefront of data-driven science. As a consequence, there is now huge demand for the physical infrastructure to produce, analyse and share software and datasets. This need is compounded by the continuing lack of trained bioinformaticians to analyse the data.  These needs are made more acute by the siloisation of bioinformatics development and data sharing. Often software is developed on local systems that are impossible to replicate outside of the host institution. This acts as a barrier to data and software sharing, and reduces the reproducibility of biological research that is underpinned by bioinformatics.

To begin to address this need, in 2014 the UK Medical Research Council made a ~£50m investment in “big data” to support the development of new research infrastructures. The £8.5m CLoud Infrastructure for Microbial Bioinformatics (CLIMB) was the only award to a microbial consortium and is one of the largest investments in microbial genomics bioinformatics ever made. CLIMB will provide training alongside bioinformatics infrastructure as a service to the academic UK medical microbial community. In this talk I will introduce some of the issues that microbial bioinformaticians face in the UK, and how we are overcoming these using our newly deployed infrastructure.

 

The NSF-funded Data and Software Preservation for Open Science (DASPOS, daspos.org) project is hosting a workshop on Container Strategies for Data & Software Preservation that Promote Open Science at the University of Notre Dame. This two-day, Linux-container-centric workshop will feature keynote speakers, lightning talks, demonstrations, and hands-on breakouts related to container strategies for software and data preservation that promote open science, science reproducibility and re-use.
 
Participants will have the opportunity to learn how others are using Docker and related container tools, such as ReproZip, Umbrella, SmartContainer, and the NDS Dashboard, in environments like the National Data Service, the Open Science Framework, and government, publisher, and institutional repositories. The workshop organizers are confident that, whatever their level of expertise with container tools and strategies for preserving software and computational environments, participants will learn from other stakeholders, benefit from the well-designed online use cases, and leave the workshop better able to share their data, software, and other digital research objects.
 
Workshop Date: May 19-20, 2016
 
Location: The workshop will take place on the campus of the University of Notre Dame at the Conference Center at McKenna Hall.
 

Workshop Format:

  • Keynote talks
  • Lightning talks* (See call)
  • End-to-end use case demonstrations
  • Hands-on containerization breakout sessions
  • “Preservation challenge” sessions (Team A will try to reproduce a computational experiment containerized & preserved by Team B)
Give a lightning talk at the workshop!
We want everyone to have an opportunity to share their enthusiasm and inspiration*! The workshop lightning talks provide a great opportunity to speak about your experiences, opinions or ideas relating to using containers for preservation in a short talk. They can be used to generate inspiration, discussion and participation in other sessions at the workshop, breakout sessions and even in the coffee breaks.

Call for participation / Actions

  • Register: Workshop Registration
  • Express your interest to present by emailing the workshop organizers and submitting a short (maximum 1 page) abstract of your lightning talk before March 31, 2016. *All accepted lightning talks will be eligible for travel support
Important Dates:
  • Lightning talk submissions: 31 March 2016 (any time of day, no further extensions; submit abstracts by email to the workshop organizers)
  • Lightning talk decisions: 8 April 2016
  • Travel support requests: 15 April 2016 (by email to the workshop organizers)
  • Decisions on travel support announced: 19 April 2016
  • Workshop: 19-20 May 2016, University of Notre Dame, USA.
  • Post-workshop report writing (participation is open to all): 30 June 2016

Nominations Due

Nominations for Commencement 2016 awards are due Monday, March 28.

Description of the Award

This award recognizes outstanding contributions in the areas of computational sciences and visualization. Such contributions may include, but are not limited to: 1) applications of high performance computation and/or visualization technology; 2) development of algorithms, codes, software environments or other tools for better using high performance computing and/or visualization. The nominated work need not have been done using CRC hardware or software. Up to three awards may be presented. Awardees will receive a $1000 cash award and a plaque.

For a list of past winners, see the CRC Computational Science and Visualization Awards.

Eligibility

This award is open to all current students seeking an advanced degree, as well as to recent graduates.

Nomination Procedure

Award nominations should be accompanied by:

  1. the student’s curriculum vitae;
  2. a one-page statement from the student summarizing the nominated work;
  3. a letter from the student’s graduate research adviser that clearly explains:
    • the intellectual significance of the work in the context of the student’s discipline
    • which parts of the work were done individually by the student and
    • the (potential) broader impact of the work on society at large;
  4. and the URL of a Web page at which additional details or supplementary information concerning the work is presented. Nominations will be judged on the quality and impact of the student’s work and not on any aspects of the design of the requested Web page.

Please also provide the following information:

  • The nominee’s mailing (and e-mail) address and phone number;
  • The nominator’s name, e-mail address and phone number;
  • Refereed and non-refereed publications should be clearly identified in the curriculum vitae. Please do not include copies of publications (these may be posted on the requested web page).

Nominations should be submitted as a formal letter of nomination to the Graduate School (502 Main Building; 1-8052).

Award winners will be invited to attend the Graduate School Awards dinner on Friday, May 13, 2016.