commit cd9d3eef5209c1367b06019102beef2818cc8c20
Author: Greg Althaus <galthaus@austin.rr.com>
Date: Sat May 23 13:00:13 2020 -0500
build: fix unit tests to be more stable.
M backend/profiles_test.go
M go.mod
M go.sum
commit 51e707c6d22270aa375709ae8518afd76e2eec50
Author: Victor Lowther <victor.lowther@gmail.com>
Date: Thu May 21 17:35:01 2020 -0500
fix(startup): Delete invalid WAL files that do not have data
If dr-provision was shut down while it was trying to create a new WAL
segment but was unable to do so due to a disk full condition, the
system can be left in a state where manual intervention is required to
come back up even after the disk space issue has been resolved. This
patch recognizes a few of those invalid WAL segment conditions and
just deletes the invalid WAL segment instead of erroring out.
M clitest/bootenv_test.go
M datastack/stack.go
M datastack/streamingSyncPassive.go
M datastack/wal.go
commit c10194d0b33dce9b7b34714f78e6aff5ba073bf6
Author: Victor Lowther <victor.lowther@gmail.com>
Date: Wed May 20 16:53:22 2020 -0500
fix(ipxe): Bump ipxe version to 1.20.1
We haven't built ipxe from scratch for a couple of years. Bump it to
get crunchy new hardware and bugfixes.
M embedded/assets/ipxe-arm64.efi
M embedded/assets/ipxe.efi
M embedded/assets/ipxe.lkrn
M embedded/assets/ipxe.pxe
commit b70973ee5b35a5f9f798b20396fdb0a4df7e24d1
Author: Victor Lowther <victor.lowther@gmail.com>
Date: Wed May 20 16:38:42 2020 -0500
fix(bootloader): Fix crash when bootloaders is not fully populated
The bootloader chooser code crashed when only a subset of the
default bootloaders was overridden. This patch fixes that.
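A minimal sketch of one way to avoid this class of crash, not necessarily what the patch itself does: start from the complete default table and layer the partial overrides on top, so every lookup key stays populated. The map keys and the names defaultBootloaders and resolveBootloaders are assumptions made for illustration.

```go
package main

// defaultBootloaders is a hypothetical default table. The keys are
// illustrative only; the file names are the embedded iPXE assets.
var defaultBootloaders = map[string]string{
	"amd64-uefi": "ipxe.efi",
	"arm64-uefi": "ipxe-arm64.efi",
	"386-pcbios": "ipxe.pxe",
}

// resolveBootloaders returns a fully populated table: every default is
// copied first, then any non-empty override replaces its default.
// A nil or partial overrides map is safe.
func resolveBootloaders(overrides map[string]string) map[string]string {
	res := make(map[string]string, len(defaultBootloaders))
	for k, v := range defaultBootloaders {
		res[k] = v
	}
	for k, v := range overrides {
		if v != "" {
			res[k] = v
		}
	}
	return res
}
```

With this shape, overriding only one architecture still leaves the other entries available to the chooser.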
M backend/bootenv.go
commit e96d3563f5611111c36714636f4917a1da517a03
Author: Greg Althaus <galthaus@austin.rr.com>
Date: Wed May 20 12:00:22 2020 -0500
fix(datastack): make sure content pack name is loaded
M datastack/stack.go
commit 4e8dbcb5925cbeb468068a1e7caffa57b6571857
Author: Victor Lowther <victor.lowther@gmail.com>
Date: Mon May 18 12:45:21 2020 -0500
perf(dhcp): Improve DHCP parallelism in packet processing.
DHCP packet processing has (until this commit) had the ability to
bottleneck the rest of the system, due to the midlayer DHCP packet
handling code having no ability to throttle itself and the backend
DHCP lease handling code being effectively serialized behind a mutex.
Those two issues in combination with excessively slow WAL flushes and
a DHCP broadcast storm can combine to cause the DHCP server to
effectively deadlock everything else and spin up extra goroutines
until the server runs out of memory.
This patch fixes both of those issues.
In the midlayer, we will now only handle one DHCP packet request/reply
cycle per MAC address at a time. This will help handle DHCP
performance during broadcast storms and cases where we are receiving
multiple incoming packets from the same server due to misconfigured
relays, switch misconfiguration, or whatever. Any duplicate packets
from the same MAC address (as determined by MAC address and DHCP
message type) are now ignored entirely. This sets the upper bound of
goroutines that the DHCP midlayer code can have outstanding to be
equal to the number of simultaneous DHCP clients trying to exchange
packets with it instead of the upper bound being equal to the number
of packets received. Along the way, un-export a few types and methods
that the outside world does not need to know about.
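The midlayer behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in midlayer/dhcp.go: the names inflightKey, inflight, tryBegin, and end are hypothetical, and the MAC address is held as a plain string for brevity.

```go
package main

import "sync"

// inflightKey identifies one DHCP exchange: client MAC address plus
// DHCP message type. Hypothetical type for this sketch.
type inflightKey struct {
	mac     string
	msgType byte
}

// inflight tracks the exchanges that are currently being handled.
type inflight struct {
	mu     sync.Mutex
	active map[inflightKey]struct{}
}

func newInflight() *inflight {
	return &inflight{active: map[inflightKey]struct{}{}}
}

// tryBegin reports whether the caller may handle the packet. It returns
// false when an identical exchange is already in progress, in which
// case the duplicate packet should be dropped.
func (f *inflight) tryBegin(k inflightKey) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, busy := f.active[k]; busy {
		return false
	}
	f.active[k] = struct{}{}
	return true
}

// end marks the exchange as finished so later packets are accepted.
func (f *inflight) end(k inflightKey) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.active, k)
}
```

A handler would call tryBegin before spawning a goroutine and end when the reply has been sent, so the number of outstanding goroutines is bounded by the number of distinct keys rather than by the number of packets.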
In the backend, the DHCP utility functions are serialized based on
DHCP strategy and token instead of serializing behind a single mutex.
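One way to serialize per strategy and token instead of behind a single mutex is a table of keyed locks. The sketch below is an assumption about the general technique, not the code in backend/dhcpUtils.go; keyedLocks, lockFor, and withLease are hypothetical names, and for simplicity the table never discards unused locks.

```go
package main

import "sync"

// keyedLocks hands out one mutex per (strategy, token) pair, so that
// work on different leases can proceed in parallel.
type keyedLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyedLocks() *keyedLocks {
	return &keyedLocks{locks: map[string]*sync.Mutex{}}
}

// lockFor returns the mutex for the pair, creating it on first use.
// The NUL separator keeps ("a", "bc") distinct from ("ab", "c").
func (k *keyedLocks) lockFor(strategy, token string) *sync.Mutex {
	key := strategy + "\x00" + token
	k.mu.Lock()
	defer k.mu.Unlock()
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	return m
}

// withLease runs fn while holding the lock for (strategy, token).
// The table lock is released before fn runs, so only callers using
// the same pair wait on each other.
func (k *keyedLocks) withLease(strategy, token string, fn func()) {
	m := k.lockFor(strategy, token)
	m.Lock()
	defer m.Unlock()
	fn()
}
```

A long-running server would also need to prune entries that are no longer in use, which this sketch leaves out.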
M backend/dhcpUtils.go
M backend/dhcpUtils_test.go
M midlayer/abp.go
M midlayer/dhcp.go
M midlayer/dhcpUtil.go
M midlayer/dhcp_test.go
M midlayer/fake_midlayer_server_test.go
M midlayer/onie.go
M midlayer/pxe.go
commit c02aa1363637d189ab2f31394df1685064470fad
Author: Victor Lowther <victor.lowther@gmail.com>
Date: Mon May 11 11:34:12 2020 -0500
feat(machines): Add a HardwareAddr index to Machines.
Recent events have shown us that relying on the machine Address to be
populated, sane, and up-to-date is not a winning strategy in the face
of DHCP servers that are not under our control -- if those DHCP
servers are oversubscribed, then it is possible for the IP address of
a system to change any time it is not up and running and under our
control. However, we still need to be able to create new Machines,
and we need a better method of checking whether or not a Machine is
already present based on a set of known criteria. Probably the best
set of known criteria is the set of MAC addresses present on the
system, but we don't have a good way of querying for nodes based on
that information. As a starting point, add an Index to Machines that
allows us to see what Machines have HardwareAddrs we are interested
in.
While we are at it, make machine validation use the new Index to make
sure machines do not have overlapping HardwareAddrs, and stop
uniqueness checking on Machine.Address if skipIPBasedBooting is true.
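The overlap check can be pictured as a lookup table from MAC address to owning machine. The following is a simplified sketch, assuming a plain in-memory map rather than the real Index implementation in backend/machines.go; the machine struct, hwIndex, and add are hypothetical names.

```go
package main

import "strings"

// machine is a cut-down stand-in for a Machine: only the fields
// needed for the overlap check are present.
type machine struct {
	Name          string
	HardwareAddrs []string
}

// hwIndex maps a lower-cased MAC address to the owning machine name.
type hwIndex map[string]string

// add registers m in the index. If any of its addresses already
// belongs to a different machine, it returns that machine's name and
// false, and leaves the index unchanged.
func (idx hwIndex) add(m machine) (string, bool) {
	for _, a := range m.HardwareAddrs {
		if owner, ok := idx[strings.ToLower(a)]; ok && owner != m.Name {
			return owner, false
		}
	}
	for _, a := range m.HardwareAddrs {
		idx[strings.ToLower(a)] = m.Name
	}
	return "", true
}
```

The same table answers the lookup question from the commit message: given a set of MAC addresses seen on the wire, any hit in the map names a Machine that already exists.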
M backend/machines.go
M clitest/test-data/output/TestCorePieces/machines.indexes/stdout.expect
End of Note