Welcome to the Github migration page. Here we will keep you posted on how the migration is progressing and what problems you might encounter as a result.

    Why are we migrating?

    For years Deltares has happily used Github under the following location: https://github.com/Deltares. Here people were free to create repositories, invite collaborators and try out all the features Github had on offer. The space was managed by a few Deltares colleagues who, when asked, could arrange things for other members.

    As of July 2023 this changed. The Deltares Management Team (MT) decided that it was no longer allowed to use cloud services that store user data on US servers. As a result of this decision GITHUB.com could no longer be used as official repository for Deltares products. This led to the implementation of an on-premises GITLAB installation.  

    Since then, many discussions have taken place and the Data Protection laws have changed. The MT has therefore reconsidered its earlier decision and decided that GITHUB.com should now become Deltares' central GIT repository. But before this can take effect the existing https://github.com/Deltares organization needs to be upgraded:

    1. All users are required to login using the Deltares identity provider MyDeltares
    2. The Deltares organization should contain only repositories linked to the Deltares Product Management Teams (PMTs)
    3. All non-PMT related repositories must be transferred to the new organization Deltares-research
    4. All Deltares repositories that are currently located on other services than GITHUB (e.g. Azure, Bitbucket or Gitlab) should migrate to the Deltares GITHUB environment.

    The migration steps

    Step 1: Setup an enterprise environment (status = done)

    A new GITHUB enterprise has been created named 'Deltares'. Under this enterprise a new organization has been created named 'Deltares research & development'. Linked to this environment are 124 seats (= licenses) and as of January 2024 an additional 100 seats will be added.

    Step 2: Move repositories (status = in progress)

    As mentioned above, the non-PMT related repositories need to be removed from the main Deltares organization. To know which repositories needed to be moved, an inventory was made of all repositories and their owners were asked to provide information. This inventory can be found here: repos-list-gihub-Deltares.xlsx . Then we started moving repositories to the new Deltares-research organization.

    On migration of a repository to the new organization, the list of collaborators for each repository is cleared. After migration, it is up to the repository administrator to invite new collaborators. All actions, issues and settings are migrated together with the repository.

    Step 3: Move the Deltares organization (status = pending)

    Once all non-PMT related repositories have been moved out of the Deltares organization, the Deltares organization will itself be moved into the Deltares enterprise. Technically nothing will move; the only change is that it will become necessary to authenticate through the Deltares identity provider. Additionally, all repositories will be assigned to a team of administrators, one team for each PMT. It will then be up to these PMT teams to manage the repositories. All existing collaborators will be cleared and must then be re-invited on request by the PMT teams.

    Step 4: Setup the processes (status = under development)

    To improve the working processes around the GITHUB repository, a Topdesk process is being created where users can request a new repository. To handle these requests and any other GITHUB related calls, support processes are being set up.

    The new processes and support groups are:

    Github Owners: Group of administrators that manage the Deltares enterprise. Can be reached by e-mail: github-owners@deltares.nl

    Github Support: Support group for all GITHUB related questions. Can be reached by e-mail: github-support@deltares.nl

    Topdesk Request form: Form where users can request a new repository.

    PublicWIKI page: Place for general information about the Deltares GITHUB environment and where one can find answers to frequently asked questions.

    Questions regarding co-creation

    Github is an ideal place to work together with colleagues and also external parties. In the new organization https://github.com/Deltares-research, there are a few ways to work with outside collaborators:


    1. Use PULL REQUESTS. Allow your repository to accept pull requests. This way collaborators do not need to be added (and managed) as 'Outside collaborators'. This works well for PUBLIC repositories.
    2. Invite an outside collaborator to your organization and then assign them to your repository. For PUBLIC repositories no license is claimed; for PRIVATE repositories 1 license is claimed. There are no additional costs for adding the same collaborator to multiple repositories.
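
    The license rule in option 2 can be written as a small calculation. The sketch below is illustrative only: the function name and the input layout are assumptions, and it covers just the PUBLIC and PRIVATE cases named above.

```python
def seats_claimed(assignments):
    """Count licenses claimed by outside collaborators.

    `assignments` maps a collaborator name to the list of repository
    visibilities ("public" or "private") that person is assigned to.
    """
    # One license per collaborator with at least one private repository,
    # no matter how many repositories that person is added to.
    return sum(
        1
        for visibilities in assignments.values()
        if any(v.lower() == "private" for v in visibilities)
    )


# Hypothetical collaborators, used only to exercise the rule.
example = {
    "alice": ["public", "public"],    # public only: no license
    "bob": ["private"],               # one private repository: 1 license
    "carol": ["private", "private"],  # several private repositories: still 1
}
print(seats_claimed(example))
```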

    Questions relating to migration

    I want to use my own Github account that I've used thus far (linked to my private e-mail address). Can I do that and if yes, how?

    It is possible to use your existing GITHUB account. However if you work for Deltares, you will have to assign your Deltares email to your GITHUB account. All Deltares employees will be invited using their Deltares email address.

    When you receive an invitation email:

    When you are invited to join the Deltares GITHUB Enterprise, an invitation email will be sent to your Deltares email address. By following the link in the invitation you will be redirected to GITHUB where you are required to login to MyDeltares:

    Use the Deltares Login button at the bottom of the login form. If you do not yet have a MyDeltares account, one will automatically be created for you. After login your MyDeltares account will be linked to your Github account.

    Login directly to the Deltares GITHUB Enterprise:

    When you browse to the Deltares Enterprise in GITHUB you will see all public repositories. It is however not possible to join the enterprise from here using your existing unlinked GITHUB account. You need to be invited by one of the repository owners first.

    Problem

    I want to request a new Github repository for my project or software product, but I do not know which of the two Deltares organizations I must choose: Deltares or Deltares-research

    Solution

    You do not need to choose anything, this will be done for you based on the information provided during the registration process.

    What is the difference between the two organizations?

    Deltares organization: This organization is meant to house all repositories that are linked to a Product Management Team (PMT) and contain production ripe software.

    Deltares-research: All repositories that do not fit into the Deltares organization.

    Problem

    After your repository is moved from https://github.com/Deltares to the new organization https://github.com/Deltares-research, it is possible that you are required to re-authorize your account in one or more of your local applications, such as VS Code, TortoiseGIT or on the GIT command line.

    Solution

    Follow these steps to refresh your authorization token:

    1. Open the Credential Manager in Windows (type 'Credential Manager').
    2. In Credential Manager select 'Windows Credentials'


    3. Look up the credentials for git:https://github.com and remove them.
    4. Re-initialise the credential manager in git bash: git config --global credential.helper manager-core
    5. Re-run git pull and follow the pop-up instructions to authenticate in a browser (with SSO this may happen automatically).
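
    The same refresh can also be scripted. The sketch below is an alternative to the manual steps rather than an official procedure: it uses `git credential reject`, a standard Git command that asks the configured credential helper to forget a stored credential, and then re-applies the helper setting from step 4. The function names are illustrative assumptions.

```python
import subprocess


def reject_input(host="github.com", protocol="https"):
    """Build the stdin payload that `git credential reject` expects."""
    # Key=value lines, terminated by a blank line.
    return f"protocol={protocol}\nhost={host}\n\n"


def refresh_commands():
    """Return the git commands that mirror steps 3 and 4 above."""
    return [
        ["git", "credential", "reject"],
        ["git", "config", "--global", "credential.helper", "manager-core"],
    ]


def refresh_credentials(run=subprocess.run):
    """Erase the cached github.com credential and reset the helper."""
    reject, set_helper = refresh_commands()
    run(reject, input=reject_input(), text=True, check=True)
    run(set_helper, check=True)
```

    After calling refresh_credentials(), run git pull as in step 5 to trigger a new login.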


    After your repository is moved from https://github.com/Deltares to the new organization https://github.com/Deltares-research, the list of collaborators has been cleared. Each repository is assigned one administrator who is tasked with inviting new collaborators.

    Inviting new collaborators can be done by the Administrator using the 'Add people' button.

    Invite collaborators using e-mail: It is important to know that when inviting new Deltares collaborators, you must invite them using their Deltares e-mail address. This assures that all members are part of the Deltares domain.

    For bigger repositories it is advised to use teams instead of adding individual users. At present it is not possible for administrators to create a Team. In such cases the administrator should send a request to the Github support group: github-support@deltares.nl

    No there is not.

    When you have questions regarding the new GITHUB environment you can post them to the GITHUB support team. We can then try to connect you to people who already have experience with GITHUB. Otherwise we suggest you contact your department head for funding.

    There is no fixed definition of 'production-ripe' software. It is up to the Product Management Teams (PMT) to decide if they wish to designate certain repositories / products as production ripe or as research. 

    All repositories under the Deltares organisations should be linked to one of the PMTs. If unsure about the status of your repository, please contact the PMT and discuss with them where your repository fits best.

    Questions relating to applications

    In the current Deltares organization, co-pilot has been enabled. There are 24 licenses available, all of which are currently taken. At present co-pilot is not enabled in the Deltares-research organization. This will change once the Deltares organization has been added to the Deltares enterprise.

    How we plan to use co-pilot in the future and how we plan to manage the licenses has not yet been worked out. For more information please contact: github-owners@deltares.nl .

    Apparently some users can no longer see their repositories in Zenodo, even when their repositories have not been moved from the Deltares organization. Why this is the case is still unclear. The resolution, however, is that one of the organization owners logs into Zenodo using their Github account and activates the required repositories.

    Deltares does not support the use of GITHUB's own Large File Storage (LFS) environment. However it is possible to connect your repository to the Deltares S3 bucket (MinIO) using GIT-LFS.

    Please read the information on the following page: How to connect your GITHUB repository to a LFS?:

    If you wish to connect your repository to the Data Version Control (DVC) tool, then please read the information on the following page: How to connect your GITHUB repository to a LFS?:

    When coding software / running models / creating extensive configurations, your code space / project folder / configurations will most likely contain a mix of text files, data files and software binaries. All these files need to be placed in a repository under version control and need to be managed as a whole. For the text files you will want to be able to compare differences between versions, in order to understand what has changed over time. This will not be the case for binary files or very large data files, as humans are generally not well equipped to compare bits and bytes.

    In the 'past' SVN was an ideal place to store your whole repository in one place. Currently SVN is in the process of being phased out and GITHUB has been introduced as a replacement. The advantages and disadvantages of both systems will not be discussed here. Instead we will focus on how to set up your GITHUB repository to include both your text-based files and your larger binaries.

    The problem with GIT (and therefore also GITHUB) is that it is not designed to handle large and/or binary files. To overcome this problem GIT Large File Storage (LFS) was introduced. The basic idea behind GIT LFS is that the actual binary file is not stored in your GITHUB repository. Instead only a reference to this file is stored. The actual binary file is stored in an Object Storage location (S3 bucket).
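
    To make the 'reference' concrete: according to the public Git LFS specification, the file committed to GIT is a small text pointer holding a version line, the SHA-256 of the real content and its size in bytes. The sketch below builds such a pointer; the function name is an illustrative assumption and the snippet is not part of any Deltares tooling.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Return the pointer text Git LFS commits in place of `content`."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# The payload is an arbitrary stand-in for a large binary file.
print(lfs_pointer(b"example binary payload"))
```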

    Although GITHUB offers LFS out-of-the-box, using its own cloud-based storage facilities, Deltares has chosen not to use this. The reason is the costs involved in storing data on the servers of GITHUB and also the costs involved in up- and downloading data to and from these servers. Instead Deltares has chosen to host its own Object Storage in the form of a MinIO server.

    In short: you will have a GIT repository in one of the two Deltares GITHUB organizations, Deltares or Deltares-research, which will contain only your text-based files and small data files, while your large files or binary files will be stored on the Deltares MinIO server.

    To manage all your text, large and binary files as a single project, you have three options to connect your GITHUB repository to the Deltares MinIO object store: GIT-LFS, DVC or custom scripts.


    Prerequisite: In the below guides we expect the user to have a basic understanding of GIT and its related commands.

    How to setup GIT-LFS?

    Setup your GITHUB repository:

    First you must set up your GITHUB repository by cloning it to your computer. If you do not yet have a GITHUB repository, you can request one on the 'Request a repository' page.

    Once your GITHUB repository is in place and up-to-date, continue with the GIT-LFS instructions.

    Setup GIT-LFS:

    Go to the GIT-LFS website and follow the instructions on how to download and install GIT-LFS. A good starting point is the 'Getting Started' section on the home page.

    GIT-LFS configuration files:

    .gitattributes:    Stored in the root folder of your repository. This file contains patterns of all files that GIT-LFS should track and manage as 'large files'.

    .gitattributes
    *.bin filter=lfs diff=lfs merge=lfs -text
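
    Each tracked pattern gets one line of this shape, which is also what the `git lfs track "<pattern>"` command writes for you. A minimal sketch for generating the lines for several patterns follows; the helper name is an assumption, and "*.nc" and "*.zip" are hypothetical patterns chosen only for illustration.

```python
def lfs_attribute_lines(patterns):
    """Return .gitattributes lines that route `patterns` through Git LFS."""
    return [f"{p} filter=lfs diff=lfs merge=lfs -text" for p in patterns]


# Only "*.bin" appears in the example above; the other patterns are made up.
for line in lfs_attribute_lines(["*.bin", "*.nc", "*.zip"]):
    print(line)
```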

    Useful examples for .gitattributes can be found here

    .lfsconfig:   Stored in the root folder of your repository. This file is necessary to point GIT-LFS to the MinIO server of Deltares instead of the default GITHUB LFS.

    .lfsconfig
    [lfs]
    url = "http://localhost:8080"

    Setup GIT-LFS API Proxy

    A Git LFS API proxy is needed to seamlessly integrate GIT-LFS with the S3-based MinIO API. The proxy serves as a bridge between GIT-LFS and the S3 storage protocol, and will translate Git LFS API calls into S3 API calls, ensuring that files tracked by Git LFS are correctly stored on S3.
    ( See N:\Deltabox\Publications\2023\ict\git-lfs-minio\ for Windows and Linux executables with example config.json )

    config.json:    Stored outside of the root folder of your repository. This file contains the MinIO end-point server of Deltares

    config.json
    {
        "serverListenAddr": ":8080",
        "minioHost": "s3.deltares.nl",
        "minioAccessKey": "gitlfs",
        "minioSecretKey": "UPONREQUEST",
        "minioBucket": "gitlfs",
        "minioURLExpires": 3600
    }
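
    The url in .lfsconfig and the serverListenAddr in config.json have to refer to the same port, otherwise GIT-LFS will not reach the proxy. The sketch below checks this for the two example files shown above. It is a convenience check under the assumption that both files use the layout shown here; the function names are illustrative.

```python
import json
import re
from urllib.parse import urlparse


def lfs_url(lfsconfig_text):
    """Extract the `url` value from the text of a .lfsconfig file."""
    match = re.search(r'^\s*url\s*=\s*"?([^"\s]+)"?\s*$', lfsconfig_text, re.M)
    if match is None:
        raise ValueError("no url entry found in .lfsconfig")
    return match.group(1)


def proxy_port(config_json_text):
    """Extract the listening port from the proxy's config.json text."""
    addr = json.loads(config_json_text)["serverListenAddr"]
    return int(addr.rsplit(":", 1)[1])


def ports_match(lfsconfig_text, config_json_text):
    """True when .lfsconfig points at the port the proxy listens on."""
    return urlparse(lfs_url(lfsconfig_text)).port == proxy_port(config_json_text)


lfsconfig = '[lfs]\nurl = "http://localhost:8080"\n'
config = '{"serverListenAddr": ":8080", "minioHost": "s3.deltares.nl"}'
print(ports_match(lfsconfig, config))
```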


    Setup credentials:


    (THIS NEEDS TO BE UPDATED: The GIT-LFS credentials are separate from GITHUB ) In order for GIT-LFS to login to the MinIO API, it is required to provide credentials. This can be configured in the Windows Credential Manager on your (Windows) computer.

    Open the Credential Manager tool from the Control Panel and add the credentials for the MinIO API as a new Generic Credentials entry.

    How to setup DVC?

    Setup your GITHUB repository:

    First you must set up your GITHUB repository by cloning it to your computer. If you do not yet have a GITHUB repository, you can request one on the 'Request a repository' page.

    Once your GITHUB repository is in place and up-to-date, continue with the DVC instructions.

    Setup DVC:

    Go to the DVC website and follow the instructions on how to download and install DVC. A good starting point is the 'Get Started' page.

    DVC configuration files:

    .dvcignore:    Stored in the root folder of your DVC project. This file contains patterns of all files that DVC should ignore.

    .dvcignore
    # Add patterns of files dvc should ignore, which could improve
    # the performance. Learn more at
    # https://dvc.org/doc/user-guide/dvcignore
    
    # Ignore secrets file
    .dvc/config.local

    .dvc/.gitignore:    Stored in the .dvc folder. This file is similar to the .dvcignore file however this file contains patterns of all files that GIT should ignore.

    .gitignore
    /config.local
    /tmp
    /cache

    .dvc/config:    Stored in the .dvc folder. Contains all DVC configuration that can be shared and can be uploaded into your repository.

    config
    [core]
        remote = miniostorage
        autostage = true
    ['remote "miniostorage"']
        url = s3://<path to your bucket>
        endpointurl = https://s3.avi.deltares.nl
        ssl_verify = false

    .dvc/config.local:    Stored in the .dvc folder. Contains all DVC configuration that cannot be shared nor uploaded into your repository

    config.local
    ['remote "miniostorage"']
        access_key_id = 
        secret_access_key = 
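
    Instead of editing the two files by hand, the same settings can be written with the dvc command line, where the --local flag sends a value to config.local rather than to the shared config. The sketch below only builds the command lines; the function name and the placeholder credentials are assumptions, and the bucket path has to be replaced with your own.

```python
def dvc_setup_commands(bucket_url, access_key, secret_key,
                       remote="miniostorage",
                       endpoint="https://s3.avi.deltares.nl"):
    """Return dvc CLI calls that produce the two config files shown above."""
    return [
        # Shared settings, written to .dvc/config
        ["dvc", "remote", "add", "-d", remote, bucket_url],
        ["dvc", "remote", "modify", remote, "endpointurl", endpoint],
        ["dvc", "remote", "modify", remote, "ssl_verify", "false"],
        ["dvc", "config", "core.autostage", "true"],
        # Secrets, written to .dvc/config.local via --local
        ["dvc", "remote", "modify", "--local", remote,
         "access_key_id", access_key],
        ["dvc", "remote", "modify", "--local", remote,
         "secret_access_key", secret_key],
    ]


# Placeholder values; substitute your own bucket path and credentials.
for cmd in dvc_setup_commands("s3://<path to your bucket>", "<key>", "<secret>"):
    print(" ".join(cmd))
```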

    How to setup custom scripts?

    How you set up your scripting environment will strongly depend on the coding language of the source code in your GITHUB repository. But in all cases you can take advantage of the REWIND functionality of MinIO. This functionality allows you to restore your data folder or files to a given point in time.

    For Python an example can be found here: https://github.com/robin-deltares/minio-py-rewind/blob/main/minio_rewind.py

    Rewind example
    from minio import Minio
    import minio_rewind
    
    # For access
    myMinioServer = 'my.minio.server'
    myAccessKey   = 'my_access_key'
    mySecretKey   = 'my_secret_key'
    
    # The path that will be recursively downloaded
    myBucketName = 'my_bucket_name'
    myPathName   = 'my_path_name'
    myRewind     = '2023.05.10T16:00' # Notation that mc uses
    
    # Minio client connection
    myClient = Minio(myMinioServer,
                     access_key=myAccessKey,
                     secret_key=mySecretKey)
    
    # Prepare the rewind-settings
    rewinder = minio_rewind.Rewinder(myClient,myRewind)
    
    # Download the objects
    rewinder.download(myBucketName,myPathName)
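
    Conceptually, a rewind selects for every object the newest stored version that is not later than the requested moment. The sketch below shows that selection on plain Python data. It assumes object versioning is enabled on the bucket, does not call the MinIO SDK, and is not the implementation used by minio_rewind; all names and the sample history are illustrative.

```python
from datetime import datetime


def parse_rewind(stamp):
    """Parse a rewind stamp written like '2023.05.10T16:00'."""
    return datetime.strptime(stamp, "%Y.%m.%dT%H:%M")


def versions_at(versions, rewind):
    """Pick, per object, the newest version not later than `rewind`.

    `versions` is an iterable of (name, version_id, modified, is_deleted)
    tuples; objects whose newest eligible version is a delete marker are
    left out of the result.
    """
    cutoff = parse_rewind(rewind)
    newest = {}
    for name, version_id, modified, is_deleted in versions:
        if modified > cutoff:
            continue  # written after the rewind point
        if name not in newest or modified > newest[name][1]:
            newest[name] = (version_id, modified, is_deleted)
    return {n: v[0] for n, v in newest.items() if not v[2]}


# Made-up version history for two objects.
history = [
    ("data/a.bin", "v1", datetime(2023, 5, 1, 9, 0), False),
    ("data/a.bin", "v2", datetime(2023, 5, 20, 9, 0), False),
    ("data/b.bin", "v1", datetime(2023, 5, 2, 9, 0), False),
    ("data/b.bin", "v2", datetime(2023, 5, 9, 9, 0), True),
]
print(versions_at(history, "2023.05.10T16:00"))
```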

    Choosing between the above solutions

    GIT-LFS

    git-lfs is intended to be transparent to git, therefore it requires a customized server. Its learning process is short and fast. Some configuration commands, and bang! it is running, storing large files independently of the git repository. That's its only function, and it does it fine. Having an additional server is not a drawback, but instead a requirement for such transparency. Once configured, files are just handled by git, by means of git hooks (endpoints that are activated after git operations).

    Limitations of GIT-LFS are documented here.

    DVC

    dvc is intended to provide independent management of large files for the final user. What dvc basically does is this: it makes git ignore the files that you wish to control (adding them to .gitignore) and instead generates an additional file with the same name and the extension .dvc. So, in order to push a commit with its corresponding files, the user is required to manually "add" (equivalent to git commit, not to git add; there is no equivalent of the git stage in dvc) and "push" to both systems. This is not a drawback, but a necessary level of control. In exchange, the remote large-file store can be any remote filesystem, accessible directly by its path, via ssh or via one of multiple drivers (google drive, amazon, etc.). Hooks are also available for dvc, which simplify the use of large files if you do not mind the additional files. Keep in mind that saving files to the remote requires additional operations, because the files are .gitignored: if you modify a file stored in dvc, the change will not be noticed by git status, and you might lose it unless you make the additional check with dvc.

    Some comparisons with related technologies can be found here.

    Custom scripts

    Scripting is the most flexible way to go. It allows you to access all of MinIO's API functionality in the language of your preference. To help you get started, MinIO offers a variety of SDKs. Scripting does imply that you as developer have enough coding skills and are also the maintainer. However, with enough real-world examples this option should not be too difficult.

    Note to developers: Please provide your examples to github-support@deltares.nl so we can incorporate them into this manual.

    If you need to contact someone regarding GITHUB then you can do the following:

    When requesting a new GITHUB repository you must provide the following information:

    Summary: Repository name

    Description:

    • Public , Internal or Private (defaults to Public)
    • If this is a code repository:
      • Programming language: Java, C#, Python, Other (please specify)
    • If you require teams:
      • Team name
      • Role: Read, Triage, Write, Maintain or Admin
    • Is repository linked to production ripe product: Yes / No (defaults to no)
      In case you select Yes then also select a PMT value in the PMT field

    Product Owner: Repository administrator

    PMT: Select the Product Management Team if the repository is linked to one (defaults to 'General')
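
    The fields and defaults listed above can be summarised as a small data structure. The sketch below is purely illustrative and is not how the Topdesk form is implemented: the class, its method names and the rule used to pick the organization are assumptions derived from the descriptions on this page.

```python
from dataclasses import dataclass


@dataclass
class RepositoryRequest:
    """Illustrative model of the request fields listed above."""

    name: str                      # Summary
    product_owner: str             # Repository administrator
    visibility: str = "Public"     # Public, Internal or Private
    production_ripe: bool = False  # defaults to No
    pmt: str = "General"           # defaults to 'General'

    def validate(self):
        """Raise ValueError when the fields contradict the rules above."""
        if self.visibility not in ("Public", "Internal", "Private"):
            raise ValueError(f"unknown visibility: {self.visibility}")
        if self.production_ripe and self.pmt == "General":
            raise ValueError("a production ripe repository needs a PMT")

    def organization(self):
        """Organization the repository would be expected to land in."""
        return "Deltares" if self.production_ripe else "Deltares-research"
```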


    Request repository

    Do you have a question regarding GITHUB or one of the repositories, then please provide the following information:

    Summary: Brief description

    Description:

    Post your question here. Be precise and provide any information that can be helpful to our team.

    Attachment: If required add an attachment


    Before you post your question, have you checked our Frequently Asked Questions section?


    Post question



