UCF Research Cyberinfrastructure (RCI)

2021 Workshops


Research Computing and Data Workshops Series

In 2021, we are pleased to bring the UCF research community a series of workshops on scientific computing and research data management. These workshops are jointly presented by the UCF Library, UCF Graduate and Research IT, and the UCF Advanced Research Computing Center (ARCC). Please check back for new/updated information.

Please note that at this time the workshops are offered only to faculty, post-docs, and students who are actively engaged in research, as capacity is limited due to the hands-on nature of the sessions. If you are interested in a session that has reached capacity, please email us and we will try to schedule a repeat. If you would like a separate session for your research team, please email us and we may be able to arrange one.



Foundations of Data Management: Finding Data Sources

Location
Online Session, please see registration link for details.
Description
Data and statistics play an important role in conducting research, yet understanding how to find, analyze, and manage them can be complicated. If you are interested in developing data skills, this workshop will provide introductory information to aid you on your path to becoming a data expert. The workshop will introduce the basic concepts of data, its importance in research, and how to find quality data in different disciplines. Subject resources in select areas such as science, engineering, social science, government, and humanities will be covered in this session.

Carpentries-Style: Introduction to bash shell scripting/Linux basics (hands-on)

Location
Online Session, please see registration link for details.
Audience

This workshop is aimed at graduate students and other researchers, but all are welcome to attend. This is a basic-level workshop for attendees who intend to start working with Unix in the future. You don't need any previous knowledge of the tools that will be taught.

Description
The Unix shell has been around longer than most of its users have been alive. It has survived so long because it is a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so they are not typing the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including "high-performance computing" supercomputers). These lessons will start you on a path towards using these resources effectively.
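A minimal sketch of the kind of task the lesson covers, combining existing programs with pipes and automating repetition with a loop (the file names here are hypothetical stand-ins):

```shell
# Create three small sample files with a loop instead of by hand.
for name in alpha beta gamma; do
    printf 'line1\nline2\n' > "${name}.txt"
done
printf 'line3\n' >> gamma.txt

# Combine existing programs with a pipe: count lines per file,
# sort numerically, and print the largest file's count.
wc -l ./*.txt | sort -n | tail -n 2 | head -n 1
```

A few keystrokes like these replace opening and counting each file by hand, which is exactly the habit the lesson aims to build.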

Advanced Research Computing Center: High-performance computing using UCF Stokes computing cluster (hands-on)

Location
Online Session, please see registration link for details.
Description
Computational research analyzes models and/or data to reach new conclusions faster or to tackle more complex scenarios. UCF has a 4000+ core cluster for general-purpose research computation across many fields of academic work. This workshop will review the capabilities of the UCF Advanced Research Computing Center in general, with a focus on the general-purpose cluster (known as Stokes). Storage system usage, job scheduling and account balancing, and job submission will be covered in an interactive hands-on session. (If you do not yet have an ARCC account, please request one ahead of time via the ARCC web site at arcc.ist.ucf.edu.)
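Work on clusters like Stokes is submitted through a batch scheduler rather than run interactively. The sketch below assumes a SLURM-style scheduler; the module and program names are hypothetical, so check the ARCC documentation for actual values:

```bash
#!/bin/bash
#SBATCH --job-name=demo        # name shown in the queue
#SBATCH --ntasks=4             # number of parallel tasks
#SBATCH --time=00:10:00        # wall-clock time limit
#SBATCH --output=demo-%j.out   # %j expands to the job ID

module load gcc                # hypothetical module name
srun ./my_program              # run across the allocated tasks
```

A script like this would be submitted with `sbatch demo.sh`, and `squeue -u $USER` would show its place in the queue.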
Presenter

Jamie Schnaitter

Foundations of Data Management: Managing Your Data

Location
Online Session, please see registration link for details.
Description
Healthy data management provides a foundation for strong research. This session will introduce participants to the components of a data management plan, data management processes, the DMPTool, and data repositories. Topics include data expectations, data and metadata standards, access and sharing considerations, and data storage and preservation.

Advanced Research Computing Center: Introduction to Globus for New Users, for Research Data Transfer/Sharing

Location
Online Session, please see registration link for details.
Description

Globus is software that enables high-speed, reliable research data transfers using the GridFTP protocol, for example to and from UCF's Advanced Research Computing Center (UCF ARCC) or external computing resources such as XSEDE.

We will provide a summary review of Globus features targeted at researchers new to Globus. We will demonstrate how to transfer and share data and how to install a Globus Connect Personal endpoint on your laptop, and we will walk through some common use cases.


Carpentries: Programming and Plotting with Python (hands-on)

Location
Online Session, please see registration link for details.
Audience

This workshop is aimed at graduate students and other researchers, but all are welcome to attend. This is a basic-level workshop for attendees who intend to start working with Python in the future. You don't need any previous knowledge of the tools that will be taught.

Description
This lesson teaches novice programmers to write modular code to perform data analysis using Python. The emphasis, however, is on teaching language-agnostic principles of programming, such as automation with loops and encapsulation with functions (see Best Practices for Scientific Computing and Good Enough Practices in Scientific Computing to learn more). The example used in this lesson analyzes a set of 12 files with simulated inflammation data collected from a trial of a new treatment for arthritis. Learners are shown how it is better to automate the analysis with functions instead of repeating analysis steps manually.
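The pattern the lesson teaches, wrapping the analysis in a function and then looping over datasets, can be sketched as follows (the data below are tiny stand-ins, not the actual inflammation files):

```python
def mean_per_patient(rows):
    """Average each row (one patient's daily readings) of a data table."""
    return [sum(row) / len(row) for row in rows]

# Stand-ins for the lesson's 12 data files: two tiny datasets keyed by
# hypothetical file names.
datasets = {
    "inflammation-01.csv": [[0, 1, 2], [1, 2, 3]],
    "inflammation-02.csv": [[2, 2, 2]],
}

# One loop replaces copy-pasting the same analysis 12 times.
for name, rows in datasets.items():
    print(name, mean_per_patient(rows))
```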

Advanced Research Computing Center: GPU-based High-performance computing using UCF Newton computing cluster (hands-on)

Location
Online Session, please see registration link for details.
Description
Computational research analyzes models and/or data to reach new conclusions faster or to tackle more complex scenarios. While UCF has a 4000+ core cluster for general-purpose research computation across many fields of academic work, some fields of research can leverage Graphics Processing Units (GPUs) to further increase computational speed; therefore, UCF also has a small, 20-node GPU computational cluster. This workshop will review the capabilities of the UCF Advanced Research Computing Center in general, with a focus on the GPU cluster (known as Newton). Job submission using the GPU hardware will be covered in an interactive hands-on session. (If you do not yet have an ARCC account, please request one ahead of time via the ARCC web site at arcc.ist.ucf.edu.)
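Assuming Newton also uses a SLURM-style scheduler, a GPU job script differs from a CPU one mainly in requesting GPU resources (the module and program names below are hypothetical; confirm the exact syntax in the ARCC documentation):

```bash
#!/bin/bash
#SBATCH --job-name=gpu-demo
#SBATCH --gres=gpu:1           # request one GPU on the node
#SBATCH --time=00:10:00

module load cuda               # hypothetical module name
srun ./my_gpu_program
```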
Presenter

Jamie Schnaitter

Machine Learning: Introduction to Amazon SageMaker for Building, Training, and Tuning Models Automatically (hands-on)

Location
Online Session, please see registration link for details.
Description
This workshop will consist of a 45-minute presentation and a 45-minute hands-on lab. In the presentation, we will introduce you to Amazon SageMaker, which helps data scientists and developers prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for machine learning. A key aspect of training ML models is the ability to tune them to the highest accuracy. In the lab, you will learn how to train and tune your ML models and deploy them into production. You will also learn real-time and batch inference techniques for getting predictions from ML models.
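The tuning idea behind the lab, searching hyperparameter values for the best validation score, can be illustrated with a generic random search in plain Python. This is not the SageMaker API (SageMaker's tuner automates this at scale against real training jobs); the accuracy function here is a toy stand-in:

```python
import random

def validation_accuracy(learning_rate):
    # Toy stand-in for "train a model, score it on validation data":
    # accuracy peaks when learning_rate is near 0.1.
    return 1.0 - abs(learning_rate - 0.1)

# Try 20 random candidates and keep the best, as a tuner would.
random.seed(0)
trials = [(validation_accuracy(lr), lr)
          for lr in (random.uniform(0.0, 1.0) for _ in range(20))]
best_score, best_lr = max(trials)
print(f"best learning rate ~ {best_lr:.3f} (score {best_score:.3f})")
```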
Presenters

Sherry Ding, Ph.D.
AI/ML Solutions Architect, AWS


Gabriel Brackman
Solutions Architect, AWS

Foundations of Data Management: Maximizing the Impact of Your Data

Location
Online Session, please see registration link for details.
Description
Citation metrics provide useful quantitative data regarding the impact of a scholar's research. Learning how to maximize your reach using tools like Web of Science, Google Metrics, ORCID, ResearchGate, and Academia.edu, among others, doesn't have to be as daunting a task as it first appears. This session will cover how to evaluate citation metrics and provide tips on managing your online research profiles.

XSEDE: New User Training (hands-on)

Location
Online Session, please see registration link for details.
Description

The NSF Extreme Science and Engineering Discovery Environment (XSEDE) is a single virtual system that gives US scientists access to advanced computing resources and services.

This session will provide UCF researchers with an introduction to the XSEDE User Portal and resources. An overview of capabilities accessible through the portal will be demonstrated in an interactive session, with participants able to follow along in their own portal accounts. Additionally, the basics of XSEDE resource architecture, covering compute, storage, and environment management, will be presented, motivating an introductory job-submission tutorial. Finally, participants will learn the basics of file transfer with Globus Online.


Collaboration Tools for Research Teams

Location
Online Session, please see registration link for details.
Description
This session will provide an overview of existing tools available for UCF researchers for collaboration, explain features and limitations of each, describe the steps for requesting access for external collaborators and present some tips and tricks. Specifically, we will touch upon Microsoft Teams, OneDrive, Dropbox and Slack.
Presenter

Carlos Acevedo

Expand Your Research Computing Capability on the Open Science Grid

Location
Online Session, please see registration link for details.
Description
Could your computational work benefit from the ability to concurrently run hundreds or thousands of independent computations, for free? The Open Science Grid, or OSG, is a worldwide network of computing power contributed by colleges, national labs, and other research-supporting institutions. For researchers at these types of institutions in the United States, the National Science Foundation funds the OSG's Open Science Pool, with capacity from more than 100 sites to drive scientific research forward through use of distributed high throughput computing. This presentation will give an overview of the computing power available through the Open Science Pool and how this computing capability can accelerate research.

Carpentries: R for Scientific Analysis

Location
Online Session, please see registration link for details.
Description
The goal of this lesson is to teach novice programmers to write modular code to perform a data analysis. R is used to teach these skills because it is a commonly used programming language in many scientific disciplines. However, the emphasis is not on teaching every aspect of R, but instead on language-agnostic principles like automation with loops and encapsulation with functions (see Best Practices for Scientific Computing to learn more). This lesson is a translation of the Python version. The example used in this lesson analyzes a set of 12 data files with inflammation data collected from a trial for a new treatment for arthritis (the data were simulated). Learners are shown how it is better to create a function and apply it to each of the 12 files using a loop instead of using copy-paste to analyze the 12 files individually.

University of Florida Research Computing Events


2021 Spring HiPerGator Symposium (Free Registration)

Location
Online Session, please see registration link for details.
Description

UF Information Technology (UFIT) will host the Spring 2021 HiPerGator Symposium on March 30. The Spring 2021 symposium will feature presentations from UF's Artificial Intelligence Research Catalyst Fund awardees who are pursuing multidisciplinary applications of AI across the university. An overview of AI training and support services provided by UFIT's Research Computing staff will also be presented.

The Spring 2021 HiPerGator Symposium is open to everyone in the UF community, along with state and national constituents.


XSEDE Webinar Series


XSEDE: An Introduction to Singularity: Containers for Scientific and High-Performance Computing

Location
Online Session, please see registration link for details.
Description

Singularity is an open-source container engine designed to bring operating system-level virtualization (containerization) to scientific and high-performance computing. With Singularity you can package complex scientific workflows --- software applications, libraries, and data --- in a simple, portable, and reproducible way, which can then be run almost anywhere. Once you've created your container, you can run it on the workstation in your lab, on a virtual machine in the public cloud, or on hundreds of thousands of compute cores on the world's largest supercomputers. Singularity is all about the mobility of compute.

In this webinar, we'll provide an overview of Singularity and how you might incorporate the use of containers in your own research. We'll also show you how to access and use some of the containerized applications that we make available to users on XSEDE systems like Comet and Expanse at SDSC.

Presenter

XSEDE engagement and outreach team

XSEDE: Big Data and Machine Learning Workshop

Location
Online Session, please see registration link for details.
Description

XSEDE, along with the Pittsburgh Supercomputing Center, is pleased to present a two-day Big Data and Machine Learning workshop.

This workshop will focus on topics such as Hadoop and Spark and will be presented using the Wide Area Classroom (WAC) training platform.

Presenter

XSEDE engagement and outreach team

XSEDE: GPU Programming Using OpenACC

Location
Online Session, please see registration link for details.
Description

XSEDE, along with the Pittsburgh Supercomputing Center, is pleased to present an OpenACC GPU programming workshop.

OpenACC is an accepted standard that uses compiler directives to allow quick development of GPU-capable code with standard languages and compilers. It has been used with great success to accelerate real applications within very short development periods. This workshop assumes knowledge of either C or Fortran programming. It will have a hands-on component using the Bridges-2 computing platform at the Pittsburgh Supercomputing Center.

Presenter

XSEDE engagement and outreach team


IDEAS Project: Best Practices for HPC Software Developers (Webinars)


Extreme-scale Scientific Software Stack (E4S)

Location
Online Session, please see registration link for details.
Description

With the increasing complexity and diversity of the software stack and system architecture of high-performance computing (HPC) systems, the traditional HPC community is facing a huge productivity challenge in software building, integration, and deployment. Recently, this challenge has been addressed by new software build-management tools such as Spack, which enable seamless software building and integration. Container-based solutions provide a versatile way to package software and are increasingly being deployed on HPC systems.

The DOE Exascale Computing Project (ECP) Software Technology focus area is developing an HPC software ecosystem that will enable the efficient and performant execution of exascale applications. Through the Extreme-scale Scientific Software Stack (E4S), it is developing a curated, Spack-based, comprehensive, and coherent software stack that will enable application developers to productively write highly parallel applications that can portably target diverse exascale architectures. E4S provides both source builds through the Spack platform and a set of containers that feature a broad collection of HPC software packages. E4S exists to accelerate the development, deployment, and use of HPC software, lowering the barriers for HPC and AI/ML users. It provides container images, build manifests, and turn-key, from-source builds of popular HPC software packages developed as Software Development Kits (SDKs). This effort spans a broad range of areas, including programming models and runtimes (MPICH, Kokkos, RAJA, OpenMPI), development tools (TAU, PAPI), math libraries (PETSc, Trilinos), data and visualization tools (Adios, HDF5, Paraview), and compilers (LLVM), all available through the Spack package manager.

The webinar will describe the community engagements and interactions that led to the many artifacts produced by E4S, and will introduce the E4S containers being deployed on HPC systems at DOE national laboratories. The presenters will discuss recent efforts and techniques to improve software integration and deployment for HPC platforms, and will describe recent collaborative work on reproducible workflows between E4S and the Pantheon project. Pantheon provides a set of working examples of end-to-end workflows using ECP applications, infrastructure, and post-processing, focused on common visualization/analysis operations and workflows of interest to application scientists; the presenters will also show a video of the workflow.

Presenters

Sameer Shende
University of Oregon and ParaTools

David Honegger Rogers
Los Alamos National Laboratory

Good Practices for Research Software Documentation

Location
Online Session, please see registration link for details.
Description

This webinar aims to introduce the importance of software documentation and the different approaches that may be taken at various stages, and on various levels, in the software development life cycle. Through shared examples and stimulating questions, the speakers aim to encourage the audience to reflect on the relationship between documentation and process, and to make informed choices about when and how to document their software.

Presenters

Stephan Druskat
Friedrich Schiller University Jena

Sorrel Harriet
Leeds Trinity University


Microsoft Research Webinar Series


Microsoft Research: Project InnerEye: Augmenting cancer radiotherapy workflows with deep learning and open source

Location
Online Session, please see registration link for details.
Description

Medical images offer vast opportunities to improve clinical workflows and outcomes. Specifically, in the context of cancer radiotherapy, clinicians need to go through computed tomography (CT) scans and manually segment (contour) anatomical structures. This is an extremely time-consuming task that puts a large burden on care providers. Deep learning (DL) models can help with these segmentation tasks. However, more understanding is needed regarding these models' clinical utility, generalizability, and safety in existing workflows. Building these models also requires techniques that are not easily accessible to researchers and care providers.

In this webinar, Dr. Ozan Oktay and Dr. Anton Schwaighofer will analyze these challenges within the context of image-guided radiotherapy procedures and will present the latest research outputs of Project InnerEye in tackling these challenges. The first part of the webinar will focus on a research study that evaluates the potential clinical impact of DL models within the context of radiotherapy planning procedures. The discussion will also include the performance analysis of state-of-the-art DL models on datasets from different hospitals and cancer types, and we'll explore how they compare with manual contours annotated by three clinical experts.

The second part of the talk will introduce the open-source InnerEye Deep Learning Toolkit and how it can provide tools to help enable users to build state-of-the-art medical image segmentation models in Microsoft Azure. There will be examples illustrating step-by-step how the toolkit can be used in different segmentation applications within Azure Machine Learning (Azure ML) infrastructure. This includes model specification, training run analysis, performance reporting, and model comparison.

Presenters

Ozan Oktay
Senior Researcher in the Health Intelligence Group at Microsoft Research Cambridge

Anton Schwaighofer
Principal Software Engineer in the Health Intelligence Group at Microsoft Research Cambridge