HPCasCode merge requests
https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests
2021-09-07T11:37:26+10:00

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/501
updated configureudev script
2021-09-07T11:37:26+10:00
Andreas Hamacher
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/500
Fix slurm secondary server
2021-08-25T07:11:24+10:00
Trung Nguyen
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/499
making sure gluster op.version is set. unfortunately via manual intervention after force fail
2021-08-24T21:47:47+10:00
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/498
requesting config from secondary controller via tags=
2021-08-24T11:05:31+10:00
Andreas Hamacher
merged with DL during Screenshare
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/497
Disable ipv6
2021-08-04T16:13:40+10:00
Andreas Hamacher
I have tested this role on massive004 and massive002, so CentOS 7 and Ubuntu. The test output posted below might not be nice to look at :(
before:
```
[ec2-user@massive004 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth00: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:bb:f8:1f brd ff:ff:ff:ff:ff:ff
inet 172.16.204.66/21 brd 172.16.207.255 scope global noprefixroute dynamic eth00
valid_lft 257227sec preferred_lft 257227sec
inet6 fe80::f816:3eff:febb:f81f/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: mlx0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4200 qdisc mq state UP group default qlen 1000
link/ether 72:63:dc:e4:4c:03 brd ff:ff:ff:ff:ff:ff
inet 172.16.197.66/21 brd 172.16.199.255 scope global noprefixroute mlx0
valid_lft forever preferred_lft forever
inet6 fdfd:eb1a:eb10:c000:7063:dcff:fee4:4c03/64 scope global noprefixroute dynamic
valid_lft 2591787sec preferred_lft 604587sec
inet6 fe80::7063:dcff:fee4:4c03/64 scope link noprefixroute
valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:1b:36:7b brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:1b:36:7b brd ff:ff:ff:ff:ff:ff
after:
[ec2-user@massive004 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth00: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:bb:f8:1f brd ff:ff:ff:ff:ff:ff
inet 172.16.204.66/21 brd 172.16.207.255 scope global noprefixroute dynamic eth00
valid_lft 257080sec preferred_lft 257080sec
3: mlx0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4200 qdisc mq state UP group default qlen 1000
link/ether 72:63:dc:e4:4c:03 brd ff:ff:ff:ff:ff:ff
inet 172.16.197.66/21 brd 172.16.199.255 scope global noprefixroute mlx0
valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:1b:36:7b brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:1b:36:7b brd ff:ff:ff:ff:ff:ff
after reboot:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth00: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:bb:f8:1f brd ff:ff:ff:ff:ff:ff
inet 172.16.204.66/21 brd 172.16.207.255 scope global noprefixroute dynamic eth00
valid_lft 259165sec preferred_lft 259165sec
3: mlx0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4200 qdisc mq state UP group default qlen 1000
link/ether 72:63:dc:e4:4c:03 brd ff:ff:ff:ff:ff:ff
inet 172.16.197.66/21 brd 172.16.199.255 scope global noprefixroute mlx0
valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:1b:36:7b brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:1b:36:7b brd ff:ff:ff:ff:ff:ff
ubuntu@massive002:~$ ip a before
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth00: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:50:9a:f9 brd ff:ff:ff:ff:ff:ff
inet 172.16.200.246/21 brd 172.16.207.255 scope global dynamic noprefixroute eth00
valid_lft 160699sec preferred_lft 160699sec
inet6 fe80::f816:3eff:fe50:9af9/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: mlx0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4200 qdisc mq state UP group default qlen 1000
link/ether 72:63:20:37:17:06 brd ff:ff:ff:ff:ff:ff
inet 172.16.193.246/21 brd 172.16.199.255 scope global noprefixroute mlx0
valid_lft forever preferred_lft forever
inet6 fdfd:eb1a:eb10:c000:9d8b:61b:c6f6:fee3/64 scope global temporary dynamic
valid_lft 603310sec preferred_lft 84316sec
inet6 fdfd:eb1a:eb10:c000:75de:833b:6025:8f5e/64 scope global temporary deprecated dynamic
valid_lft 517506sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:fca2:b148:457f:890/64 scope global temporary deprecated dynamic
valid_lft 431701sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:b586:cde2:fed:dfed/64 scope global temporary deprecated dynamic
valid_lft 345896sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:7c46:c491:48fc:2aa4/64 scope global temporary deprecated dynamic
valid_lft 260092sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:d1dc:59d0:e98:7f15/64 scope global temporary deprecated dynamic
valid_lft 174287sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:11a8:b743:c68:a2cd/64 scope global temporary deprecated dynamic
valid_lft 88482sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:4c18:8c37:1447:ae70/64 scope global temporary deprecated dynamic
valid_lft 2678sec preferred_lft 0sec
inet6 fdfd:eb1a:eb10:c000:7063:20ff:fe37:1706/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591952sec preferred_lft 604752sec
inet6 fe80::7063:20ff:fe37:1706/64 scope link noprefixroute
valid_lft forever preferred_lft forever
ubuntu@massive002:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth00: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:50:9a:f9 brd ff:ff:ff:ff:ff:ff
inet 172.16.200.246/21 brd 172.16.207.255 scope global dynamic noprefixroute eth00
valid_lft 160675sec preferred_lft 160675sec
3: mlx0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4200 qdisc mq state UP group default qlen 1000
link/ether 72:63:20:37:17:06 brd ff:ff:ff:ff:ff:ff
inet 172.16.193.246/21 brd 172.16.199.255 scope global noprefixroute mlx0
valid_lft forever preferred_lft forever
```
Trung Nguyen

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/496
modifying extra packages to include vars files from playbook level
2021-07-28T11:50:56+10:00
Andreas Hamacher
I don't like always having to run include_vars. This large one especially is costly.
Now that we are starting to support both Ubuntu and CentOS, I am avoiding a second if-else construct here.
Simon Michnowicz

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/495
add a role for OpenStackImageBugfixs
2021-07-28T10:35:14+10:00
Andreas Hamacher
The purpose of this role is to replace lines 25-28 of
[computenodes.yml](https://gitlab.erc.monash.edu.au/hpc-team/clusterbuild/-/blob/master/computenodes.yml)
so that they are not run every time. I also need this code in HaC because I am using the same base image.
Simon Michnowicz

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/494
quotes ansible bugfix.
2021-07-28T11:20:16+10:00
Andreas Hamacher
adding quotes to make the ansible parser happy
Simon Michnowicz

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/493
deploying gres.conf in role slurm_config
2021-07-27T12:25:27+10:00
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/492
Enroot
2021-07-26T16:37:28+10:00
Simon Michnowicz
Install enroot on a node.
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/491
Bumpversions
2021-09-29T13:10:49+10:00
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/490
logrotate role to compress rotated files
2021-07-26T12:37:26+10:00
Andreas Hamacher
CR: https://jira.apps.monash.edu/browse/HCM-82
tested on massive004 (CentOS) and massive002 (Ubuntu) via
```
sudo logrotate --force /etc/logrotate.d/syslog
sudo logrotate --force /etc/logrotate.conf
```
before:
```
-rw------- 1 root root 0 Jul 25 03:25 messages
-rw-------. 1 root root 3247768 Jul 19 15:22 messages-20210719
-rw------- 1 root root 0 Jul 19 16:42 messages-20210725
```
after:
```
-rw------- 1 root root 0 Jul 26 12:24 messages
-rw-------. 1 root root 3247768 Jul 19 15:22 messages-20210719
-rw------- 1 root root 0 Jul 19 16:42 messages-20210725
-rw------- 1 root root 20 Jul 25 03:25 messages-20210726.gz
```
Trung Nguyen

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/489
ubuntu20 desktop changes
2021-07-29T10:46:15+10:00
Andreas Hamacher
As a proof of concept, we can now run Ubuntu 20.04 desktop on our cluster.
We used the m3t000 machine for this; the changes we needed to make are listed below (reason for each change in brackets):
- [x] 1. Remove python-is-python3
- [x] 2. Set default python to python2: `sudo update-alternatives --install /usr/bin/python python /usr/bin/python2 1` (to get get-xorg.py to work on Ubuntu 20.04, only works with python2)
- [ ] 3. Add `source /etc/profile.d/modulecmd.sh` to the top of the desktop.slurm or sbatch_vis_session (to get module to work in the slurm script)
- [x] 4. Edit /etc/X11/Xwrapper.config and change the allowed_users variable to `allowed_users=anybody` (this is to get xinit to work; by default it is locked to console only, so xterm can't be started via ssh without changing this config)
- [x] 5. Run `dpkg-reconfigure xserver-xorg-legacy` to make sure /etc/X11/Xwrapper.config doesn't get overridden during update (see the conf file for more detail)
- [x] 6. Update `/usr/local/desktop/desktop_start_arg` to replace hard-coded lspci with `which lspci` (this is a legacy script from Paul Mac, all the GUI apps use this script to start - this is now done, changes merged)
- [x] 7. Install python-tk package (this is to get the Desktop Walltime script to work)
- [x] 8. Add the below code to /etc/bash.bashrc (this is to get module command to work from terminal on the desktop)
```
if [ -f /etc/profile.d/modulecmd.sh ]; then
. /etc/profile.d/modulecmd.sh
fi
```
All except no. 6 need to be put into Ansible.
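As a starting point for that Ansible work, steps 4 and 8 above could first be written as idempotent shell functions. This is only a sketch under stated assumptions: the function names and the file-path arguments are hypothetical, and the Xwrapper key is assumed to be spelled `allowed_users`, as in the stock Ubuntu file. It is not the code that was run on m3t000.

```shell
# Sketch only: function names and file-path arguments are hypothetical.
set -eu

# Step 4: make sure Xwrapper.config lets any user start X.
# Assumes the key is spelled allowed_users, as in the stock Ubuntu file.
set_allowed_users() {
    local conf="$1"
    if grep -q '^allowed_users=' "$conf" 2>/dev/null; then
        # Key already present: rewrite its value in place (GNU sed).
        sed -i 's/^allowed_users=.*/allowed_users=anybody/' "$conf"
    else
        # Key missing (or file absent): append it.
        echo 'allowed_users=anybody' >> "$conf"
    fi
}

# Step 8: append the modulecmd snippet only if it is not already there.
add_modulecmd() {
    local rc="$1"
    if ! grep -qF '/etc/profile.d/modulecmd.sh' "$rc" 2>/dev/null; then
        cat >> "$rc" <<'EOF'
if [ -f /etc/profile.d/modulecmd.sh ]; then
    . /etc/profile.d/modulecmd.sh
fi
EOF
    fi
}
```

Run as root, this would be called as `set_allowed_users /etc/X11/Xwrapper.config` and `add_modulecmd /etc/bash.bashrc`. Running either function a second time leaves the file unchanged, which is the property an Ansible task would also need.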
I am testing a couple of applications that aren't working (ChimeraX 0.91 seems to work on P4 but not T4, but ChimeraX 0.93 works on both) and if you want to play around, let me know.
I tested this using Strudel Desktop, so I will still need to test this on Strudel 2.
Chris Hines

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/488
Kernelupdate
2021-07-19T14:48:47+10:00
Andreas Hamacher
bumping default lustre to latest LTS version and some quotation bugfixes

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/487
typo. wrong Redhat, right RedHat
2021-07-19T12:54:36+10:00
Andreas Hamacher
I am self-merging this one without review since no logic is changed. Bugfix only.
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/486
adding a role to mount from ceph
2021-07-13T16:35:33+10:00
Andreas Hamacher
pre-approved by CH see
https://gitlab.erc.monash.edu.au/hpc-team/clusterbuild/-/merge_requests/839
Andreas Hamacher

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/485
Fitdgx
2021-07-13T16:33:30+10:00
Andreas Hamacher
Trung Nguyen

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/484
ibv_devinfo does have a common path between RHEL and Ubuntu
2021-07-13T11:47:48+10:00
Andreas Hamacher
I am removing an unnecessary when clause here (when is a bit similar to if else)
Jay Van Schyndel

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/483
just some more reasonable service restarts
2021-07-13T10:35:28+10:00
Andreas Hamacher
Trung Nguyen

https://gitlab.erc.monash.edu.au/hpc-team/HPCasCode/-/merge_requests/482
role to pin packages on ubuntu using apt preference files
2021-07-13T16:33:49+10:00
Andreas Hamacher
small new role implementing what Swe showed us today
tested on m3t000
Trung Nguyen
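
For readers unfamiliar with the apt preference files that merge request 482 refers to, the idea can be sketched as a small shell function that writes one pin file. The function name, its arguments, and the example package and version are hypothetical; this is not the role's actual interface or content.

```shell
# Sketch only: write_apt_pin and its arguments are hypothetical,
# not the interface of the role from merge request 482.
write_apt_pin() {
    local dir="$1" package="$2" version="$3"
    # A priority above 1000 makes apt keep this version,
    # even if that means a downgrade.
    cat > "$dir/$package.pref" <<EOF
Package: $package
Pin: version $version
Pin-Priority: 1001
EOF
}
```

On a real node this might be called as `write_apt_pin /etc/apt/preferences.d somepackage '1.2.3*'`, after which `apt-cache policy somepackage` shows which version apt now prefers.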