NSD Adv. Comp. WG meeting

US/Pacific
Description

ZOOM:

We can meet in the Swiatecki Room (Building 70, Room 228)

    • 14:00 14:20
      Discussion on collaboration opportunities with CRD, Amazon, etc. 20m

      From Science IT

      https://docs.google.com/presentation/d/1f4j1K7xjv93k3Jh7hmJ9N7pjxbwxrjCHPeFL45_L1mQ/edit?usp=share_link 


      From Amazon - Noah - post meeting:

       

      It was great meeting everyone today, and we only really had time to scratch the surface of the interesting work you can do on AWS. There were a couple of introductory use cases we covered that I wanted to provide some reading material on:

       

      • HPC bursting during crunch time
        • ParallelCluster is the primary tool I mentioned for creating HPC clusters in AWS. ParallelCluster can provision clusters similar to the environment at NERSC for portable workloads. We also have an EnginFrame HPC Connector detailed in this blog, which can provide a single interface for hybrid HPC resources between AWS and on-prem.  
      • ML Tools
        • We broadly group machine-learning-related services into two categories: AI services, which typically use pretrained models to perform common tasks like speech recognition and OCR, and ML services, which provide tools to train your own models. SageMaker is the primary ML service; it has a suite of features covering every step of the ML process, to help build, train, and deploy ML-based solutions.
      • Hosting web applications:
        • There are many ways to build web applications, from traditional VM-hosted applications to fully serverless event-driven apps. The options for resiliency and DR vary based on the architecture, but that is something I can help you navigate. The Materials Project uses Elastic Container Service (ECS) to host serverless containers with the Fargate platform. We recently published a case study with the Materials Project that has some more details.
      • Distributing large datasets:
        • The Open Data program is a platform that provides free public access to high-value datasets. In the Open Data Sponsorship Program, AWS will assume the cost of hosting and distributing datasets if they meet the qualifications of the program. S3 is a powerful tool for storing large datasets and distributing them with access controls.

       

      Please let me know if there were any other topics of interest and we can set up a follow-up discussion to go into more detail. The account team is here as a resource for you to understand and make use of AWS services.

       

      Thanks,

      Noah Leuthaeuser

    • 14:20 14:30
      Slides on NSD-CRD collaboration 10m
      • need for a short presentation (3 slides) on the current NSD-CSA connections - by Thursday this week
      • context: meeting between PSA and CSA ALDs and Division Directors
      • request for input from your area
    • 14:30 14:50
      AOB 20m