Add architecture overview #4
Changed title from "WIP: Add architecture overview" to "Add architecture overview"

Force-pushed from 6dc8deec03 to b171f10143

Do we need to add some notes on how Project Syn plays its part?
@@ -0,0 +20,4 @@
> | Cluster Type | Purpose |
> |--------------|---------|
> | **Management Cluster** | Hosts the Servala Portal, centralized monitoring, and alerting infrastructure |
> | **Workload Clusters** | Run customer workloads and AppCat service instances |

Suggestion: "Run customer service instances and AppCat control-plane"
@@ -0,0 +22,4 @@
> | **Management Cluster** | Hosts the Servala Portal, centralized monitoring, and alerting infrastructure |
> | **Workload Clusters** | Run customer workloads and AppCat service instances |
>
> All clusters run Talos Linux and are deployed using Terraform/OpenTofu across multiple cloud providers. The architecture is designed for secure, private access with no public Kubernetes API exposure.

Should we already mention that some CSPs may offer "Managed Talos" (like Hidora/Hikube), where we would potentially use their service instead of bringing our own?
I have added a short note stating that we are open to this if the CSP offers a managed Kubernetes based on Talos.
@@ -0,0 +24,4 @@
> All clusters run Talos Linux and are deployed using Terraform/OpenTofu across multiple cloud providers. The architecture is designed for secure, private access with no public Kubernetes API exposure.
>
> All clusters run a highly available control plane consisting of 3 master nodes. Worker nodes are scaled manually based on capacity needs.

Suggestion: "Worker nodes are currently scaled manually based on capacity needs."
@@ -0,0 +30,4 @@
> AppCat is the core component of the Servala service catalog, built on [Crossplane](https://www.crossplane.io/). It runs on every cluster and enables the provisioning and management of cloud-native services for customers.
>
> For detailed information on AppCat's architecture and capabilities, see the [AppCat documentation](https://docs.appcat.ch/index.html).

Maybe also link to https://kb.vshn.ch/app-catalog/, which is more technical and covers the architecture.
done
@@ -0,0 +38,4 @@
> ### CIDR Allocation Strategy
>
> To enable potential future mesh connectivity, each cluster receives non-overlapping network ranges:

Do you want to elaborate here on what constraints these prefix lengths will impose? I.e. what that means in terms of max nodes, pods, or services per cluster, and the maximum number of clusters supported?
Added what these CIDRs imply
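For illustration, a minimal sketch of how pod and service subnets are declared in a Talos machine config. The prefixes below are hypothetical, not the ones from the document, and the capacity notes assume the default /24 pod CIDR per node:

```yaml
cluster:
  network:
    podSubnets:
      - 10.100.0.0/16   # with a /24 per node: at most 256 nodes, ~65k pod IPs total
    serviceSubnets:
      - 10.101.0.0/20   # ~4094 usable ClusterIP addresses per cluster
```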
@@ -0,0 +87,4 @@
> ### Audit Logging
>
> Kubernetes API server audit logging is enabled on all clusters to track who did what and when. Audit logs are collected centrally alongside other cluster logs.

I think we need to track this in a Jira issue so that we do not forget to set it up =)
done, it's in our board
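As a reference point, a minimal Kubernetes audit policy of the kind the API server would be started with. The rules here are a hypothetical starting point, not what was actually configured:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Skip high-volume, low-value watch traffic from kube-proxy.
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
  # Record request metadata (who, what, when) for everything else.
  - level: Metadata
```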
@@ -0,0 +95,4 @@
> ## Naming Conventions
>
> ### Cluster Names

Can you elaborate on where this cluster name is used? I think this is a Project Syn-specific thing? Why do we need this `c-servala` prefix?

I just stuck with it since it makes it a bit easier to switch between the Syn-related stuff and my infra configs. I personally don't mind the prefix at all.
I'm OK with keeping the prefix. I'd still love to have a sentence added about what this cluster name is used for, because it differs from the DNS names, for example.
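For background, in Project Syn the cluster name is the ID of the Cluster object registered with Lieutenant. A sketch assuming the Lieutenant `Cluster` CRD, with purely hypothetical names:

```yaml
apiVersion: syn.tools/v1alpha1
kind: Cluster
metadata:
  name: c-servala-example-1   # hypothetical cluster ID with the c-servala prefix
spec:
  displayName: Example Servala cluster
  tenantRef:
    name: t-servala           # hypothetical tenant ID
```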
@@ -0,0 +127,4 @@
> | Group | Pattern | Examples |
> |-------|---------|----------|
> | Control plane | `master-[ID]` | `master-904e`, `master-8beb`, `master-dcb8` |
> | Workers | `worker-[ID]` | `worker-e852` |

Do we care about the worker type in the naming scheme? For example, on Cloudscale we might have plus and flex workers.
My intention with this was to keep the worker naming scheme the same across the cluster, no matter what compute flavour a node uses. I am using node labels to distinguish this, for example `node.kubernetes.io/instance-type: plus-16-4` (see the sketch below).
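A minimal sketch of how a workload could then target a specific flavour via that label; the label value is the one quoted above, the pod itself is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: flavour-pinned-example   # hypothetical workload
spec:
  nodeSelector:
    # Schedule only onto "plus" flavour workers carrying this instance-type label.
    node.kubernetes.io/instance-type: plus-16-4
  containers:
    - name: app
      image: nginx   # placeholder image
```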
@@ -0,0 +129,4 @@
> | Control plane | `master-[ID]` | `master-904e`, `master-8beb`, `master-dcb8` |
> | Workers | `worker-[ID]` | `worker-e852` |
>
> ## Cluster Provisioning

I think this section belongs in a separate document. This is not architecture; it is more of a "how-to" or runbook.
@@ -0,0 +276,4 @@
> ## Image Management
>
> Container images are pulled from public registries (e.g., ghcr.io for AppCat components). [Spegel](https://github.com/spegel-org/spegel) is deployed on each cluster to provide peer-to-peer image sharing between nodes, reducing external registry pulls and improving pull performance.

Yay! Is this part of Talos, or do we install/manage it?
it's something we slap on top
@@ -0,0 +285,4 @@
> | Scenario | Solution |
> |----------|----------|
> | CSP provides block storage | Use native CSP storage with appropriate CSI driver |
> | CSP lacks storage options | Deploy Rook Ceph for software-defined storage |

I would go one step further: the CSP must support CSI, otherwise we can't work with them. I don't think we should run Rook, tbh. This is a qualification step for the CSP: no CSI? No Servala.
I am more than fine with this statement. Initially I only included it in the document because I was told that, for example, the performance of Exoscale's block storage was not great, but it has been a while since that was assessed. I still need to find out, but I think it'll be good enough.
Yes, I'll discard the Rook part. That makes things a lot easier for us.
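To make the CSI qualification concrete, a sketch of the kind of StorageClass a workload cluster would ship when the CSP brings its own CSI driver. The provisioner name is assumed to be cloudscale's CSI driver name, purely for illustration; all other values are hypothetical:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
# Assumed driver name for cloudscale's CSI driver; swap in the CSP's actual one.
provisioner: csi.cloudscale.ch
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```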
Force-pushed from b171f10143 to 38e3b6e05a

Force-pushed from 38e3b6e05a to de5fba5e2a