Tags

Tags give the ability to mark specific points in history as being important.
This project is mirrored from https://github.com/neondatabase/autoscaling.
  • v0.13.6
    v0.13.6
    
    Hotfix release fixing the plugin's incorrect usage of fun/pubsub.Queue
    that caused some events to be dropped.
  • v0.13.5
    v0.13.5
    
    Hotfix release fixing the plugin's Score method so that it takes into
    account actual resource usage. This release only contains a backport of
    the fix from #426.
  • v0.13.4
    v0.13.4
    
    Hotfix release to fix two issues:
    
    1. Scheduler plugin's Filter logic was incorrectly counting Completed
       pods into the usage calculations.
    2. Scheduler plugin's node state 'Buffer' field was always underflowing.
    
    The fixes were in #423 and #424, respectively.
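
The two v0.13.4 fixes above can be illustrated with a minimal sketch. This is not the plugin's actual code: the `pod` type, the phase constants, and both helper functions are hypothetical stand-ins, and treating "Completed" as the Succeeded or Failed pod phase is an assumption.

```go
package main

import "fmt"

// podPhase and pod are hypothetical stand-ins for the Kubernetes pod types;
// the real plugin works with k8s API objects.
type podPhase string

const (
	phaseRunning   podPhase = "Running"
	phaseSucceeded podPhase = "Succeeded"
	phaseFailed    podPhase = "Failed"
)

type pod struct {
	name     string
	phase    podPhase
	milliCPU uint32
}

// isCompleted reports whether a pod has finished (assumed here to mean the
// Succeeded or Failed phase) and so should not count toward usage.
func isCompleted(p pod) bool {
	return p.phase == phaseSucceeded || p.phase == phaseFailed
}

// totalUsage sums CPU for pods that are still active, skipping completed ones.
func totalUsage(pods []pod) uint32 {
	var total uint32
	for _, p := range pods {
		if isCompleted(p) {
			continue
		}
		total += p.milliCPU
	}
	return total
}

// saturatingSub returns a - b, or 0 when b > a, instead of wrapping around
// the way plain unsigned subtraction would.
func saturatingSub(a, b uint32) uint32 {
	if b > a {
		return 0
	}
	return a - b
}

func main() {
	pods := []pod{
		{"vm-a", phaseRunning, 1000},
		{"job-b", phaseSucceeded, 500},
		{"vm-c", phaseRunning, 250},
	}
	fmt.Println(totalUsage(pods))        // 1250
	fmt.Println(saturatingSub(100, 250)) // 0
}
```

With plain unsigned arithmetic, `100 - 250` would wrap to a very large value, which is the kind of underflow a buffer-style field is exposed to.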
  • v0.13.3
    a010f282 · Bump: v0.13.2 -> v0.13.3
    v0.13.3
    
    Another small release, with a minor improvement to the plugin's method
    call metrics, so we can avoid tripping alerts for overprovisioning pods.
    
    Change was in #422, nothing else included.
  • v0.13.2
    v0.13.2
    
    Another small release, primarily to fix the scheduler's handling of
    overprovisioning pods. Also contains a minor improvement.
    
    Fixes:
    
    - plugin: Fix handling of ignored namespaces during Filter (#416, #418)
    
    Other changes:
    
    - plugin: Emit k8s Event on failed ExtractVmInfo (#408)
      - Should help with observability for certain failures.
    
    Upgrade path from v0.12.x / v0.13.x:
    
    - No ordering requirements.
  • v0.13.1
    v0.13.1
    
    Small release, primarily to fix a leak in the scheduler plugin.
    Also contains other bugfixes.
    
    Fixes:
    
    - plugin: Memory leak (#415)
    - plugin: Missing node metrics during initial load (#410)
    - neonvm/runner: Missing error logs (#401)
    - neonvm/runner: Various log.Printf calls with unnecessary trailing newline (#401)
    - neonvm/controller, informant: Typos in error messages (#407)
    
    Upgrade path from v0.12.x / v0.13.0:
    
    - No ordering requirements.
  • v0.13.0
    v0.13.0
    
    This relatively small release contains significant changes to existing
    behavior in both the autoscaler-agent and scheduler plugin.
    
    No breaking API changes (technically).
    
    Features:
    
    - agent: Memory-based scaling (#393)
      - Currently implemented in a similar manner to our load average-based
        scaling, via total memory usage, including the kernel.
    - plugin: Allow ignoring resource usage from namespaces (#399)
      - Carveout for 'overprovisioning' pods now that we're tracking
        everything.
    
    Fixes:
    
    - plugin: Improve plugin method logs (#405)
      - Previously, some notable metrics were being increased without
        suitable accompanying log messages.
    
    No protocol changes.
    
    Other changes:
    
    - plugin: Track all pods (#399)
      - Should make our accounting & metrics reporting much more accurate.
    - plugin: Remove 'System' reserved resources (#399)
      - No longer necessary, because we're tracking everything.
  • v0.12.2
    v0.12.2
    
    Small release, just containing #395 - a fix for #234, where the
    autoscaler-agent's per-VM Runner will panic when the scaling bounds
    decrease below the current usage.
    
    This was fast-tracked for release because of the impact on VM pools.
    It's not hard-blocking, but is significant enough that it's worth
    fixing beforehand.
  • v0.12.1
    v0.12.1
    
    This release contains bugfixes and new metrics (along with some changes
    to existing ones).
    
    No breaking API changes.
    
    Features:
    
    - plugin: New migration-related metrics (#387):
      - autoscaling_plugin_migrations_created_total
      - autoscaling_plugin_migrations_deleted_total
      - autoscaling_plugin_migration_create_fails_total
      - autoscaling_plugin_migration_delete_fails_total
    - plugin: Include node group in node resource metrics (#382)
    - agent: agent->informant request metrics now include the endpoint (#380)
    
    Fixes:
    
    - Add vmscrape.yaml to release assets (#392)
    - plugin: Fix spurious "updated scaling bounds" logs (#391)
      - Incidentally, this *also* entirely fixes our handling of scaling
        bounds changes.
    - plugin: Migration handling reliability improvements (#387)
    - informant: Fix parent process stall when child dies quickly (#389)
    - agent: Fix NeonVM downscaling not showing up in metrics (#381)
    
    No protocol changes.
    
    No other changes.
    
    Upgrade path from v0.12.0:
    
    - No ordering requirements.
  • v0.12.0
    v0.12.0
    
    This release contains bugfixes (lots of them!), new metrics, and
    BREAKING CHANGES TO OLD METRICS.
    
    No breaking API changes.
    
    Features:
    
    - neonvm: Propagate label/annotation changes to runner pod(s) (#279)
    - agent: Add scaling metrics! (#334)
      - All of:
        - autoscaling_agent_scheduler_plugin_{requested,approved}_{cpu,mem}_change_total
        - autoscaling_agent_informant_{requested,approved}_{cpu,mem}_change_total
        - autoscaling_agent_neonvm_requested_{cpu,mem}_change_total
        - autoscaling_agent_neonvm_outbound_requests_total
    - plugin: Add per-node resource metrics (#363)
      - Two new metrics:
        - autoscaling_plugin_node_cpu_resources_current
        - autoscaling_plugin_node_mem_resources_current
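
The brace patterns in the agent metrics above are shorthand. Assuming they follow shell-style brace expansion (an interpretation, not something the notes state), the full names can be spelled out with a small sketch; `expandScalingMetrics` is a hypothetical helper, not part of the project.

```go
package main

import "fmt"

// expandScalingMetrics spells out the brace patterns from the release notes,
// assuming shell-style expansion of {requested,approved} and {cpu,mem}.
func expandScalingMetrics() []string {
	var names []string
	for _, target := range []string{"scheduler_plugin", "informant"} {
		for _, kind := range []string{"requested", "approved"} {
			for _, res := range []string{"cpu", "mem"} {
				names = append(names, fmt.Sprintf(
					"autoscaling_agent_%s_%s_%s_change_total", target, kind, res))
			}
		}
	}
	// The NeonVM pattern only has a "requested" variant.
	for _, res := range []string{"cpu", "mem"} {
		names = append(names, fmt.Sprintf(
			"autoscaling_agent_neonvm_requested_%s_change_total", res))
	}
	names = append(names, "autoscaling_agent_neonvm_outbound_requests_total")
	return names
}

func main() {
	for _, n := range expandScalingMetrics() {
		fmt.Println(n)
	}
}
```

Under that interpretation the four patterns cover eleven distinct metric names.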
    
    Fixes:
    
    - Add whereabouts.yaml to release assets (#348)
    - neonvm: Don't propagate kubectl's last-applied-configuration annotation (#344)
    - agent: Reset Runner endState on restart (#349)
      - This bug caused the agent's metrics to never show a
        previously-panicked Runner as recovered, even when it was.
    - agent/schedwatch: Fix spurious close (#352)
      - This bug was causing agents to be unable to recognize new
        schedulers.
    - plugin/watch: Remove redundant error wrapping (#358)
    - plugin: Fix filter cycle metrics (#356)
      - This REMOVES two metrics:
        - autoscaling_plugin_filter_cycle_successes_total
        - autoscaling_plugin_filter_cycle_rejections_total
      - See the PR for more details.
    - README: fix make commands to reflect kind/k3d (#365)
    - plugin: Cleanup state for deleted k8s Nodes (#361)
      - Should *hopefully* fix a particular memory leak, but it's not clear.
    - informant/filecache: Close DB connections (#367)
      - This was causing some users to be unable to connect to their
        database because the informant took all the connections.
      - This was already released as v0.11.1
    - agent/billing: Move push logic into separate thread (#368)
      - This was preventing us from having more reasonable request timeouts
        (like... anything above 2s)
    
    No protocol changes.
    
    Other changes:
    
    - util/watch: More logs! (#351)
    - agent: Record neon/endpoint-id for each Runner if/when assigned (#353)
    - agent: Improve help message for autoscaling_agent_tracked_vms_current (#354)
    - agent/billing: Log IdempotencyKey of events (#366)
    - billing: Add x-trace-id header to requests (#372)
    
    Upgrade path from v0.11.0:
    
    - No ordering requirements, but considering the fixes to the agent's
      scheduler detection, it's probably worthwhile to update any agents
      first.
  • v0.11.1
    v0.11.1
    
    Hotfix release, backporting #367 to fix a bug in the informant that
    caused it to never close DB connections when the file cache integration
    is enabled.
  • v0.11.0
    v0.11.0
    
    This release contains bugfixes, new features, and large changes to the
    NeonVM controller.
    
    Breaking API changes:
    
    - neonvm: VirtualMachine .spec.extraNetwork fields changed (#256)
      - Removed multusNetworkNoIP
      - Made multusNetwork omitempty
    - neonvm: VirtualMachineMigrations no longer have post-copy enabled by default (#256)
    
    Features:
    
    - neonvm: Two new VmPhase types: "PreMigrating" and "Scaling" (#256)
    - neonvm: Migration source runner pod now has an ownerref pointing back
      to the migration (#332)
    - ci: Added support for k3d (#340)
    - plugin: new metrics
      - autoscaling_plugin_filter_cycle_successes_total (#346)
      - autoscaling_plugin_filter_cycle_rejections_total (#346)
      - autoscaling_plugin_extension_call_fails_total (#347)
    
    Fixes:
    
    - scheduler: Fixed agent-handler log keys explosion (#338)
      - NB: this was already released as v0.10.1
    - scheduler: Fixed missing `continue` when skipping completed pods (#342)
      - NB: this was already released as v0.10.2
    - scheduler: Fixed outdated log line (#343)
      - Removed "[autoscale-enforcer] load state: " prefix from the message
    - agent: Do informant health checks even when suspended (#341)
    
    No protocol changes.
    
    Other changes:
    
    - ci: kind and kubectl versions tweaked (#336)
    - k8s deps upgraded to 1.25.11 (#339)
    - plugin: Capitalize pluginCalls metric labels (#345)
    
    There's even more changes to the NeonVM controller that aren't listed
    here. For more, see #256.
    
    Upgrade path from v0.10.x:
    
    - No ordering requirements.
  • v0.10.2
    v0.10.2
    
    Hotfix release, backporting #342 to fix the scheduler plugin's handling
    of completed pods on startup.
  • v0.10.1
    v0.10.1
    
    Hotfix release, backporting #338 to fix scheduler plugin logs for agent
    requests.
  • v0.10.0
    v0.10.0
    
    This release contains bugfixes, ???, and a breaking change to the
    agent<->informant protocol.
    
    Breaking API changes:
    
    - agent<->informant: Include AgentID in informant /downscale and /upscale (#316)
      - This bumps the agent<->informant protocol to v2.
      - The agent currently supports both versions, and will for the
        immediate future.
    
    Features:
    
    - neonvm/builder: Make output prettier (#280)
    - Start switch from klog -> zap [agent/plugin/informant] (#323)
      - All kinds of dashboards need updating. It's for the best.
    
    Fixes:
    
    - agent/informant: Fix inverted condition for logs (#315)
    - plugin: Handle usage updates for non-autoscaling VMs (#312)
    - plugin: Fix Unreserve condition (#317)
    - util/watch: Set failingCurrent gauge to zero so it shows up (#320)
    - neonvm: Fix default ports from Go client (#257)
    
    Protocol changes:
    
    - See above, re: agent<->informant changes.
    
    Other changes:
    
    - deploy: Change metrics scrape interval 10s -> 60s (#321)
    - neonvm/runner: Set AutomountServiceAccountToken = false (#298)
    - agent/billing: Use NeonVM .status.cpus, not .spec.guest.cpus.use (#325)
    
    Upgrade path from v0.9.0:
    
    - All autoscaler-agents must be upgraded before any vm-informants
    - No other requirements.
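
The ordering requirement above follows from version overlap. A minimal sketch, assuming each side advertises a range of supported protocol versions and that an upgraded informant speaks only v2 (both assumptions; `versionRange` and `negotiate` are hypothetical names, not the project's API):

```go
package main

import "fmt"

// versionRange is a hypothetical inclusive range of supported protocol versions.
type versionRange struct {
	minVer, maxVer int
}

// negotiate returns the highest version supported by both sides, and false
// when the two ranges do not overlap.
func negotiate(a, b versionRange) (int, bool) {
	lo, hi := a.minVer, a.maxVer
	if b.minVer > lo {
		lo = b.minVer
	}
	if b.maxVer < hi {
		hi = b.maxVer
	}
	if lo > hi {
		return 0, false
	}
	return hi, true
}

func main() {
	oldAgent := versionRange{1, 1}
	newAgent := versionRange{1, 2} // supports both versions, per the notes
	oldInformant := versionRange{1, 1}
	newInformant := versionRange{2, 2} // assumption: v2 only

	fmt.Println(negotiate(newAgent, oldInformant)) // 1 true
	fmt.Println(negotiate(newAgent, newInformant)) // 2 true
	fmt.Println(negotiate(oldAgent, newInformant)) // 0 false
}
```

Under these assumptions, only the pairing of an old agent with a new informant fails, which is why agents go first.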
  • v0.9.0
    v0.9.0
    
    This release contains bugfixes and upgrades to Kubernetes 1.25.
    
    Breaking API changes:
    
    - Upgrading to K8s 1.25. NB: Autoscaling requires K8s control planes
      with a version equal or +1; i.e. K8s 1.25 OR 1.26 is required.
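
The "equal or +1" rule above can be expressed as a small check. This is only a sketch: `controlPlaneSupported` is a hypothetical helper, and comparing minor versions alone assumes the major version is 1 on both sides.

```go
package main

import "fmt"

// controlPlaneSupported reports whether a control plane minor version is
// either equal to the minor version the components target or exactly one
// minor version newer.
func controlPlaneSupported(targetMinor, controlPlaneMinor int) bool {
	diff := controlPlaneMinor - targetMinor
	return diff == 0 || diff == 1
}

func main() {
	for _, minor := range []int{24, 25, 26, 27} {
		fmt.Printf("1.%d supported: %v\n", minor, controlPlaneSupported(25, minor))
	}
}
```

For a target of 1.25, this accepts control planes at 1.25 and 1.26 and rejects 1.24 and 1.27.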
    
    Features:
    
    - New metrics! (#306, #310)
      - Too many to cover here; refer to those PRs instead.
    
    Fixes:
    
    - util/watch: Fix race condition on k8s watch.Update events (#295)
    - agent/informant: Fix informant server exit logs (#286)
    - api: Fix ExtractVmInfo disallowing min > use or use > max (#303)
      - This one may be counterintuitive at first. See #249 for context.
    - agent: Fix vmEvent formatting (#307)
    - informant: Suspend old agent *before* new one (#308)
    - util/watch: Fix racy behavior with InitModeDefer (#305)
      - This was causing billing events to not be generated for VMs until an
        event *after* startup occurs for them.
    - plugin: Allow overcommitted nodes on startup (#313)
    - agent: Stop SchedulerWatch when Runner finishes (#314)
      - This was preventing the switchover to a new scheduler on upgrade or
        restart
    
    Other changes:
    
    - Fix yaml formatting for autoscaler-agent config deploy (#300)
    
    No protocol changes.
    
    Upgrade path from v0.8.0:
    
    - No ordering requirements.
  • v0.8.0
    v0.8.0
    
    This release contains bugfixes, a new component, minor public-facing API
    changes, and significant changes to the deployed services, but no
    inter-component API changes.
    
    Breaking API changes:
    
    - NeonVM: restart policy no longer applies directly to the pod (#293)
    
    Features:
    
    - Add patch for cluster-autoscaler compatibility with VMs (#232)
    - NeonVM: implement RestartPolicy (#293)
    - NeonVM security and networking redesign (#245)
      - Runner pod no longer has Privileged: true
      - QEMU in the runner pod runs under its own user
      - Adapted generic-device-plugin for NeonVM, to give access to /dev/kvm
        and /dev/vhost-*
      - Switch from neonvm-vxlan-ipam to Whereabouts CNI
        -> Allows using overlay IP addresses in normal pods as well as VMs
      - Reconcile cycles improved
    - NeonVM/vm-builder: Add --enable-file-cache flag (default: off) (#265)
    - NeonVM: user RBAC roles (#284):
      - neonvm-virtualmachine-viewer-role
      - neonvm-virtualmachine-editor-role
      - neonvm-virtualmachinemigration-viewer-role
      - neonvm-virtualmachinemigration-editor-role
    - More logs for autoscaler-agent (#290, #291)
    - More autoscaler-agent metrics:
      - autoscaling_agent_runner_starts   (#273)
      - autoscaling_agent_runner_restarts (#273)
      - autoscaling_agent_runner_fatal_errors_total (#274)
      - autoscaling_errored_vm_runners_current      (#274)
    
    Fixes:
    
    - NeonVM/vm-builder: Fix command passthrough (#263)
    - NeonVM/vm-builder: Fix cgexec being ignored (#281)
    - NeonVM/vm-builder: Build without cgo (#255)
      - This removes the dependency on a dynamically loaded libc.
    - informant: Fix cgroup memory.high throttling (#223)
    - agent: Various logs fixes (#242, #267, #271, #272)
    - agent: Restart panicked/errored runners (#273)
    - agent/billing: Don't count VMs that aren't running (#278)
    - agent, sched: Add ports to pod spec for metrics (#282)
    - agent, sched: Fix logging of MilliCPU (#261)
    - sched: Don't output command help on error (#253)
    - plugin: Handle completed pods as if deleted (#260)
    
    No protocol changes.
    
    Other changes:
    
    - Many unused RBAC (and other) items removed:
      - Namespace autoscaler-config (#245)
      - ClusterRole vm-view (#284)
      - ClusterRole vm-patcher (#284)
      - ClusterRoleBinding kube-system/autoscaler-vm-view (#284)
      - ClusterRoleBinding kube-system/autoscale-scheduler-as-vm-patcher (#284)
      - Role kube-system/autoscale-scheduler-config-reader (#284)
      - RoleBinding kube-system/autoscale-scheduler-config-reader (#284)
    - NeonVM: Rename 'runner' container to 'neonvm-runner' (#277)
    - agent: Network error metrics include root cause (#287)
    
    Upgrade path from v0.7.2:
    
    - No ordering requirements.
    - You may wish to remove old items as mentioned above.
  • v0.7.3-alpha3
    v0.7.3-alpha3
    
    This is a pre-release just for building and distributing images.
    Do not deploy anything from this release.