From patchwork Tue Apr 12 22:35:33 2022
X-Patchwork-Submitter: "Schmidt, Adriaan" <adriaan.schmidt@siemens.com>
X-Patchwork-Id: 1675
From: Adriaan Schmidt <adriaan.schmidt@siemens.com>
To: isar-users@googlegroups.com
Cc: Adriaan Schmidt <adriaan.schmidt@siemens.com>
Subject: [PATCH 1/2] scripts: add isar-sstate
Date: Wed, 13 Apr 2022 08:35:33 +0200
Message-ID: <20220413063534.799526-2-adriaan.schmidt@siemens.com>
In-Reply-To: <20220413063534.799526-1-adriaan.schmidt@siemens.com>
References: <20220413063534.799526-1-adriaan.schmidt@siemens.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0

This adds a maintenance helper script to work with remote/shared sstate caches.
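For illustration, here is a minimal standalone sketch of how the script parses sstate cache file names, using the same regular expression as in the patch below (the named groups are assumed to mirror the `SstateCacheEntry` fields; the example file name is hypothetical):

```python
import re

# Same filename regex as in the patch; group names assumed from the
# SstateCacheEntry fields (pn, arch, hash, task, suffix).
SstateRegex = re.compile(r'sstate:(?P<pn>[^:]*):[^:]*:[^:]*:[^:]*:'
                         r'(?P<arch>[^:]*):[^:]*:(?P<hash>[0-9a-f]*)_'
                         r'(?P<task>[^\.]*)\.(?P<suffix>.*)')

# Hypothetical cache artifact name following the SSTATE_PKGSPEC layout:
name = 'sstate:hello:debian-bullseye-amd64:1.0:r0:amd64:3:0123456789abcdef_dpkg_build.tgz'
m = SstateRegex.match(name)
print(m.group('pn'))      # hello
print(m.group('arch'))    # amd64
print(m.group('task'))    # dpkg_build
print(m.group('suffix'))  # tgz
```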
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
 scripts/isar-sstate | 743 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 743 insertions(+)
 create mode 100755 scripts/isar-sstate

diff --git a/scripts/isar-sstate b/scripts/isar-sstate
new file mode 100755
index 00000000..b1e2c1ec
--- /dev/null
+++ b/scripts/isar-sstate
@@ -0,0 +1,743 @@
+#!/usr/bin/env python3
+"""
+This software is part of Isar
+Copyright (c) Siemens AG, 2022
+
+# isar-sstate: Helper for management of shared sstate caches
+
+Isar uses the sstate cache feature of bitbake to cache the output of certain
+build tasks, potentially speeding up builds significantly. This script is
+meant to help manage shared sstate caches, speeding up builds using cache
+artifacts created elsewhere. There are two main ways of accessing a shared
+sstate cache:
+  - Point `SSTATE_DIR` to a persistent location that is used by multiple
+    builds. bitbake will read artifacts from there, and also immediately
+    store generated cache artifacts in this location. This speeds up local
+    builds, and if `SSTATE_DIR` is located on a shared filesystem, it can
+    also benefit others.
+  - Point `SSTATE_DIR` to a local directory (e.g., simply use the default
+    value `${TOPDIR}/sstate-cache`), and additionally set `SSTATE_MIRRORS`
+    to a remote sstate cache. bitbake will use artifacts from both locations,
+    but will write newly created artifacts only to the local folder
+    `SSTATE_DIR`. To share them, you need to explicitly upload them to
+    the shared location, which is what isar-sstate is for.
+
+isar-sstate implements four commands (upload, clean, info, analyze),
+and supports three remote backends (filesystem, http/webdav, AWS S3).
+
+## Commands
+
+### upload
+
+The `upload` command pushes the contents of a local sstate cache to the
+remote location, uploading all files that don't already exist on the remote.
+
+### clean
+
+The `clean` command deletes old artifacts from the remote cache. It takes two
+arguments, `--max-age` and `--max-sig-age`, each of which must be a number,
+followed by one of `w`, `d`, `h`, `m`, or `s` (for weeks, days, hours, minutes,
+seconds, respectively).
+
+`--max-age` specifies up to which age artifacts are kept in the cache.
+Anything older will be removed. Note that this only applies to the `.tgz` files
+containing the actual cached items, not the `.siginfo` files containing the
+cache metadata (signatures and hashes).
+To permit analysis of caching details using the `analyze` command, the siginfo
+files can be kept longer, as indicated by `--max-sig-age`. If not set explicitly,
+this defaults to `max_age`, and any explicitly given value can't be smaller
+than `max_age`.
+
+### info
+
+The `info` command scans the remote cache and displays some basic statistics.
+The argument `--verbose` increases the amount of information displayed.
+
+### analyze
+
+The `analyze` command iterates over all artifacts in the local sstate cache,
+and compares them to the contents of the remote cache. If an item is not
+present in the remote cache, the signature of the local item is compared
+to all potential matches in the remote cache, identified by matching
+architecture, recipe (`PN`), and task. This analysis has the same output
+format as `bitbake-diffsigs`.
+
+## Backends
+
+### Filesystem backend
+
+This uses a filesystem location as the remote cache. In case you can access
+your remote cache this way, you could also have bitbake write to the cache
+directly, by setting `SSTATE_DIR`. However, using `isar-sstate` gives
+you a uniform interface, and lets you use the same code/CI scripts across
+heterogeneous setups. Also, it gives you the `analyze` command.
+
+### http backend
+
+An HTTP server with WebDAV extension can be used as a remote cache.
+Apache can easily be configured to function as a remote sstate cache, e.g.:
+```
+<VirtualHost *:80>
+    Alias /sstate/ /path/to/sstate/location/
+    <Location /sstate/>
+        Dav on
+        Options Indexes
+        Require all granted
+    </Location>
+</VirtualHost>
+```
+In addition, you need to load Apache's dav module:
+```
+a2enmod dav
+```
+
+To use the http backend, you need to install the Python webdavclient library.
+On Debian you would:
+```
+apt-get install python3-webdavclient
+```
+
+### S3 backend
+
+An AWS S3 bucket can be used as remote cache. You need to ensure that AWS
+credentials are present (e.g., in your AWS config file or as environment
+variables).
+
+To use the S3 backend you need to install the Python botocore library.
+On Debian you would:
+```
+apt-get install python3-botocore
+```
+"""
+
+import argparse
+from collections import namedtuple
+import datetime
+import os
+import re
+import shutil
+import sys
+from tempfile import NamedTemporaryFile
+import time
+
+sys.path.insert(0, os.path.join(os.path.dirname(os.path.realpath(__file__)), '..', 'bitbake', 'lib'))
+analysis_supported = True
+from bb.siggen import compare_sigfiles
+
+# runtime detection of supported targets
+webdav_supported = True
+try:
+    import webdav3.client
+    import webdav3.exceptions
+except ModuleNotFoundError:
+    webdav_supported = False
+
+s3_supported = True
+try:
+    import botocore.exceptions
+    import botocore.session
+except ModuleNotFoundError:
+    s3_supported = False
+
+SstateCacheEntry = namedtuple(
+    'SstateCacheEntry', 'hash path arch pn task suffix islink age size'.split())
+
+# The filename of sstate items is defined in Isar:
+# SSTATE_PKGSPEC = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:"
+#                  "${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
+
+# This regex extracts relevant fields:
+SstateRegex = re.compile(r'sstate:(?P<pn>[^:]*):[^:]*:[^:]*:[^:]*:'
+                         r'(?P<arch>[^:]*):[^:]*:(?P<hash>[0-9a-f]*)_'
+                         r'(?P<task>[^\.]*)\.(?P<suffix>.*)')
+
+
+class SstateTargetBase(object):
+    def __init__(self, path):
+        """Constructor
+
+        :param path: URI of the remote (without leading 'protocol://')
+        """
+        pass
+
+    def __repr__(self):
+        """Format remote for printing
+
+        :returns: URI string, including 'protocol://'
+        """
+        pass
+
+    def exists(self, path=''):
+        """Check if a remote path exists
+
+        :param path: path (file or directory) to check
+        :returns: True if path exists, False otherwise
+        """
+        pass
+
+    def create(self):
+        """Try to create the remote
+
+        :returns: True if remote could be created, False otherwise
+        """
+        pass
+
+    def mkdir(self, path):
+        """Create a directory on the remote
+
+        :param path: path to create
+        :returns: True on success, False on failure
+        """
+        pass
+
+    def upload(self, path, filename):
+        """Uploads a local file to the remote
+
+        :param path: remote path to upload to
+        :param filename: local file to upload
+        """
+        pass
+
+    def delete(self, path):
+        """Delete remote file and remove potential empty directories
+
+        :param path: remote file to delete
+        """
+        pass
+
+    def list_all(self):
+        """List all sstate files in the remote
+
+        :returns: list of SstateCacheEntry objects
+        """
+        pass
+
+    def download(self, path):
+        """Prepare to temporarily access a remote file for reading
+
+        This is meant to provide access to siginfo files during analysis. Files
+        must not be modified, and should be released using release() once they
+        are no longer used.
+
+        :param path: remote path
+        :returns: local path to file
+        """
+        pass
+
+    def release(self, download_path):
+        """Release a temporary file
+
+        :param download_path: local file
+        """
+        pass
+
+
+class SstateFileTarget(SstateTargetBase):
+    def __init__(self, path):
+        if path.startswith('file://'):
+            path = path[len('file://'):]
+        self.path = path
+        self.basepath = os.path.abspath(path)
+
+    def __repr__(self):
+        return f"file://{self.path}"
+
+    def exists(self, path=''):
+        return os.path.exists(os.path.join(self.basepath, path))
+
+    def create(self):
+        return self.mkdir('')
+
+    def mkdir(self, path):
+        try:
+            os.makedirs(os.path.join(self.basepath, path), exist_ok=True)
+        except OSError:
+            return False
+        return True
+
+    def upload(self, path, filename):
+        shutil.copy(filename, os.path.join(self.basepath, path))
+
+    def delete(self, path):
+        try:
+            os.remove(os.path.join(self.basepath, path))
+        except FileNotFoundError:
+            pass
+        dirs = path.split('/')[:-1]
+        for d in [dirs[:i] for i in range(len(dirs), 0, -1)]:
+            try:
+                os.rmdir(os.path.join(self.basepath, '/'.join(d)))
+            except FileNotFoundError:
+                pass
+            except OSError:  # directory is not empty
+                break
+
+    def list_all(self):
+        all_files = []
+        now = time.time()
+        for subdir, dirs, files in os.walk(self.basepath):
+            reldir = subdir[(len(self.basepath)+1):]
+            for f in files:
+                m = SstateRegex.match(f)
+                if m is not None:
+                    islink = os.path.islink(os.path.join(subdir, f))
+                    age = int(now - os.path.getmtime(os.path.join(subdir, f)))
+                    all_files.append(SstateCacheEntry(
+                        path=os.path.join(reldir, f),
+                        size=os.path.getsize(os.path.join(subdir, f)),
+                        islink=islink,
+                        age=age,
+                        **(m.groupdict())))
+        return all_files
+
+    def download(self, path):
+        # we don't actually download, but instead just pass the local path
+        if not self.exists(path):
+            return None
+        return os.path.join(self.basepath, path)
+
+    def release(self, download_path):
+        # as we didn't download, there is nothing to clean up
+        pass
+
+
+class SstateDavTarget(SstateTargetBase):
+    def __init__(self, url):
+        if not webdav_supported:
+            print("ERROR: No webdav support. Please install the webdav3 Python module.")
+            print("INFO: on Debian: 'apt-get install python3-webdavclient'")
+            sys.exit(1)
+        m = re.match('^([^:]+://[^/]+)/(.*)', url)
+        if not m:
+            print(f"Cannot parse target path: {url}")
+            sys.exit(1)
+        self.host = m.group(1)
+        self.basepath = m.group(2)
+        if not self.basepath.endswith('/'):
+            self.basepath += '/'
+        self.dav = webdav3.client.Client({'webdav_hostname': self.host})
+        self.tmpfiles = []
+
+    def __repr__(self):
+        return f"{self.host}/{self.basepath}"
+
+    def exists(self, path=''):
+        return self.dav.check(self.basepath + path)
+
+    def create(self):
+        return self.mkdir('')
+
+    def mkdir(self, path):
+        dirs = (self.basepath + path).split('/')
+        for i in range(len(dirs)):
+            d = '/'.join(dirs[:(i+1)]) + '/'
+            if not self.dav.check(d):
+                if not self.dav.mkdir(d):
+                    return False
+        return True
+
+    def upload(self, path, filename):
+        return self.dav.upload_sync(remote_path=self.basepath + path, local_path=filename)
+
+    def delete(self, path):
+        self.dav.clean(self.basepath + path)
+        dirs = path.split('/')[1:-1]
+        for d in [dirs[:i] for i in range(len(dirs), 0, -1)]:
+            items = self.dav.list(self.basepath + '/'.join(d), get_info=True)
+            if len(items) > 0:
+                # collection is not empty
+                break
+            self.dav.clean(self.basepath + '/'.join(d))
+
+    def list_all(self):
+        now = time.time()
+
+        def recurse_dir(path):
+            files = []
+            for item in self.dav.list(path, get_info=True):
+                if item['isdir'] and not item['path'] == path:
+                    files.extend(recurse_dir(item['path']))
+                elif not item['isdir']:
+                    m = SstateRegex.match(item['path'][len(path):])
+                    if m is not None:
+                        modified = time.mktime(
+                            datetime.datetime.strptime(
+                                item['created'],
+                                '%Y-%m-%dT%H:%M:%SZ').timetuple())
+                        age = int(now - modified)
+                        files.append(SstateCacheEntry(
+                            path=item['path'][len(self.basepath):],
+                            size=int(item['size']),
+                            islink=False,
+                            age=age,
+                            **(m.groupdict())))
+            return files
+        return recurse_dir(self.basepath)
+
+    def download(self, path):
+        # download to a temporary file
+        tmp = NamedTemporaryFile(prefix='isar-sstate-', delete=False)
+        tmp.close()
+        try:
+            self.dav.download_sync(remote_path=self.basepath + path, local_path=tmp.name)
+        except webdav3.exceptions.RemoteResourceNotFound:
+            return None
+        self.tmpfiles.append(tmp.name)
+        return tmp.name
+
+    def release(self, download_path):
+        # remove the temporary download
+        if download_path is not None and download_path in self.tmpfiles:
+            os.remove(download_path)
+            self.tmpfiles = [f for f in self.tmpfiles if not f == download_path]
+
+
+class SstateS3Target(SstateTargetBase):
+    def __init__(self, path):
+        if not s3_supported:
+            print("ERROR: No S3 support. Please install the botocore Python module.")
+            print("INFO: on Debian: 'apt-get install python3-botocore'")
+            sys.exit(1)
+        session = botocore.session.get_session()
+        self.s3 = session.create_client('s3')
+        if path.startswith('s3://'):
+            path = path[len('s3://'):]
+        m = re.match('^([^/]+)(?:/(.+)?)?$', path)
+        self.bucket = m.group(1)
+        if m.group(2):
+            self.basepath = m.group(2)
+            if not self.basepath.endswith('/'):
+                self.basepath += '/'
+        else:
+            self.basepath = ''
+        self.tmpfiles = []
+
+    def __repr__(self):
+        return f"s3://{self.bucket}/{self.basepath}"
+
+    def exists(self, path=''):
+        if path == '':
+            # check if the bucket exists
+            try:
+                self.s3.head_bucket(Bucket=self.bucket)
+            except botocore.exceptions.ClientError as e:
+                print(e)
+                print(e.response['Error']['Message'])
+                return False
+            return True
+        try:
+            self.s3.head_object(Bucket=self.bucket, Key=self.basepath + path)
+        except botocore.exceptions.ClientError as e:
+            if e.response['ResponseMetadata']['HTTPStatusCode'] != 404:
+                print(e)
+                print(e.response['Error']['Message'])
+            return False
+        return True
+
+    def create(self):
+        return self.exists()
+
+    def mkdir(self, path):
+        # in S3, folders are implicit and don't need to be created
+        return True
+
+    def upload(self, path, filename):
+        try:
+            self.s3.put_object(Body=open(filename, 'rb'), Bucket=self.bucket, Key=self.basepath + path)
+        except botocore.exceptions.ClientError as e:
+            print(e)
+            print(e.response['Error']['Message'])
+
+    def delete(self, path):
+        try:
+            self.s3.delete_object(Bucket=self.bucket, Key=self.basepath + path)
+        except botocore.exceptions.ClientError as e:
+            print(e)
+            print(e.response['Error']['Message'])
+
+    def list_all(self):
+        now = time.time()
+
+        def recurse_dir(path):
+            files = []
+            try:
+                result = self.s3.list_objects(Bucket=self.bucket, Prefix=path, Delimiter='/')
+            except botocore.exceptions.ClientError as e:
+                print(e)
+                print(e.response['Error']['Message'])
+                return []
+            for f in result.get('Contents', []):
+                m = SstateRegex.match(f['Key'][len(path):])
+                if m is not None:
+                    modified = time.mktime(f['LastModified'].timetuple())
+                    age = int(now - modified)
+                    files.append(SstateCacheEntry(
+                        path=f['Key'][len(self.basepath):],
+                        size=f['Size'],
+                        islink=False,
+                        age=age,
+                        **(m.groupdict())))
+            for p in result.get('CommonPrefixes', []):
+                files.extend(recurse_dir(p['Prefix']))
+            return files
+        return recurse_dir(self.basepath)
+
+    def download(self, path):
+        # download to a temporary file
+        tmp = NamedTemporaryFile(prefix='isar-sstate-', delete=False)
+        try:
+            result = self.s3.get_object(Bucket=self.bucket, Key=self.basepath + path)
+        except botocore.exceptions.ClientError:
+            return None
+        tmp.write(result['Body'].read())
+        tmp.close()
+        self.tmpfiles.append(tmp.name)
+        return tmp.name
+
+    def release(self, download_path):
+        # remove the temporary download
+        if download_path is not None and download_path in self.tmpfiles:
+            os.remove(download_path)
+            self.tmpfiles = [f for f in self.tmpfiles if not f == download_path]
+
+
+def arguments():
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        'command', type=str, metavar='command',
+        choices='info upload clean analyze'.split(),
+        help="command to execute (info, upload, clean, analyze)")
+    parser.add_argument(
+        'source', type=str, nargs='?',
+        help="local sstate dir (for uploads or analysis)")
+    parser.add_argument(
+        'target', type=str,
+        help="remote sstate location (a file://, http://, or s3:// URI)")
+    parser.add_argument(
+        '-v', '--verbose', default=False, action='store_true')
+    parser.add_argument(
+        '--max-age', type=str, default='1d',
+        help="clean tgz files older than MAX_AGE (a number followed by w|d|h|m|s)")
+    parser.add_argument(
+        '--max-sig-age', type=str, default=None,
+        help="clean siginfo files older than MAX_SIG_AGE (defaults to MAX_AGE)")
+
+    args = parser.parse_args()
+    if args.command in 'upload analyze'.split() and args.source is None:
+        print(f"ERROR: '{args.command}' needs a source and target")
+        sys.exit(1)
+    elif args.command in 'info clean'.split() and args.source is not None:
+        print(f"ERROR: '{args.command}' must not have a source (only a target)")
+        sys.exit(1)
+    return args
+
+
+def sstate_upload(source, target, verbose, **kwargs):
+    if not os.path.isdir(source):
+        print(f"WARNING: source {source} does not exist. Not uploading.")
+        return 0
+
+    if not target.exists() and not target.create():
+        print(f"ERROR: target {target} does not exist and could not be created.")
+        return -1
+
+    print(f"INFO: uploading {source} to {target}")
+    os.chdir(source)
+    upload, exists = [], []
+    for subdir, dirs, files in os.walk('.'):
+        target_dirs = subdir.split('/')[1:]
+        for f in files:
+            file_path = (('/'.join(target_dirs) + '/') if len(target_dirs) > 0 else '') + f
+            if target.exists(file_path):
+                if verbose:
+                    print(f"[EXISTS] {file_path}")
+                exists.append(file_path)
+            else:
+                upload.append((file_path, target_dirs))
+    upload_gb = (sum([os.path.getsize(f[0]) for f in upload]) / 1024.0 / 1024.0 / 1024.0)
+    print(f"INFO: uploading {len(upload)} files ({upload_gb:.02f} GB)")
+    print(f"INFO: {len(exists)} files already present on target")
+    for file_path, target_dirs in upload:
+        if verbose:
+            print(f"[UPLOAD] {file_path}")
+        target.mkdir('/'.join(target_dirs))
+        target.upload(file_path, file_path)
+    return 0
+
+
+def sstate_clean(target, max_age, max_sig_age, verbose, **kwargs):
+    def convert_to_seconds(x):
+        seconds_per_unit = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400, 'w': 604800}
+        m = re.match(r'^(\d+)(w|d|h|m|s)?', x)
+        if m is None:
+            print(f"ERROR: cannot parse MAX_AGE '{max_age}', needs to be a number followed by w|d|h|m|s")
+            sys.exit(-1)
+        if (unit := m.group(2)) is None:
+            print("WARNING: MAX_AGE without unit, assuming 'days'")
+            unit = 'd'
+        return int(m.group(1)) * seconds_per_unit[unit]

+    max_age_seconds = convert_to_seconds(max_age)
+    if max_sig_age is None:
+        max_sig_age_seconds = max_age_seconds
+    else:
+        max_sig_age_seconds = max(max_age_seconds, convert_to_seconds(max_sig_age))
+
+    if not target.exists():
+        print(f"INFO: cannot access target {target}. Nothing to clean.")
+        return 0
+
+    print(f"INFO: scanning {target}")
+    all_files = target.list_all()
+    links = [f for f in all_files if f.islink]
+    if links:
+        print(f"NOTE: we have links: {links}")
+    tgz_files = [f for f in all_files if f.suffix == 'tgz']
+    siginfo_files = [f for f in all_files if f.suffix == 'tgz.siginfo']
+    del_tgz_files = [f for f in tgz_files if f.age >= max_age_seconds]
+    del_tgz_hashes = [f.hash for f in del_tgz_files]
+    del_siginfo_files = [f for f in siginfo_files if
+                         f.age >= max_sig_age_seconds or f.hash in del_tgz_hashes]
+    print(f"INFO: found {len(tgz_files)} tgz files, {len(del_tgz_files)} of which are older than {max_age}")
+    print(f"INFO: found {len(siginfo_files)} siginfo files, {len(del_siginfo_files)} of which "
+          f"correspond to tgz files or are older than {max_sig_age}")
+
+    for f in del_tgz_files + del_siginfo_files:
+        if verbose:
+            print(f"[DELETE] {f.path}")
+        target.delete(f.path)
+    freed_gb = sum([x.size for x in del_tgz_files + del_siginfo_files]) / 1024.0 / 1024.0 / 1024.0
+    print(f"INFO: freed {freed_gb:.02f} GB")
+    return 0
+
+
+def sstate_info(target, verbose, **kwargs):
+    if not target.exists():
+        print(f"INFO: cannot access target {target}. No info to show.")
+        return 0
+
+    print(f"INFO: scanning {target}")
+    all_files = target.list_all()
+    size_gb = sum([x.size for x in all_files]) / 1024.0 / 1024.0 / 1024.0
+    print(f"INFO: found {len(all_files)} files ({size_gb:0.2f} GB)")
+
+    if not verbose:
+        return 0
+
+    archs = list(set([f.arch for f in all_files]))
+    print(f"INFO: found the following archs: {archs}")
+
+    key_task = {'deb': 'dpkg_build',
+                'rootfs': 'rootfs_install',
+                'bootstrap': 'bootstrap'}
+    recipes = {k: [] for k in key_task.keys()}
+    others = []
+    for pn in set([f.pn for f in all_files]):
+        tasks = set([f.task for f in all_files if f.pn == pn])
+        ks = [k for k, v in key_task.items() if v in tasks]
+        if len(ks) == 1:
+            recipes[ks[0]].append(pn)
+        elif len(ks) == 0:
+            others.append(pn)
+        else:
+            print(f"WARNING: {pn} could be any of {ks}")
+    for k, entries in recipes.items():
+        print(f"Cache hits for {k}:")
+        for pn in entries:
+            hits = [f for f in all_files if f.pn == pn and f.task == key_task[k] and f.suffix == 'tgz']
+            print(f"  - {pn}: {len(hits)} hits")
+    print("Other cache hits:")
+    for pn in others:
+        print(f"  - {pn}")
+    return 0
+
+
+def sstate_analyze(source, target, **kwargs):
+    if not os.path.isdir(source):
+        print(f"ERROR: source {source} does not exist. Nothing to analyze.")
+        return -1
+    if not target.exists():
+        print(f"ERROR: target {target} does not exist. Nothing to analyze.")
+        return -1
+
+    source = SstateFileTarget(source)
+    local_sigs = {s.hash: s for s in source.list_all() if s.suffix.endswith('.siginfo')}
+    remote_sigs = {s.hash: s for s in target.list_all() if s.suffix.endswith('.siginfo')}
+
+    key_tasks = 'dpkg_build rootfs_install bootstrap'.split()
+
+    check = [k for k, v in local_sigs.items() if v.task in key_tasks]
+    for local_hash in check:
+        s = local_sigs[local_hash]
+        print(f"\033[1;33m==== checking local item {s.arch}:{s.pn}:{s.task} ({s.hash[:8]}) ====\033[0m")
+        if local_hash in remote_sigs:
+            print(" -> found hit in remote cache")
+            continue
+        remote_matches = [k for k, v in remote_sigs.items() if s.arch == v.arch and s.pn == v.pn and s.task == v.task]
+        if len(remote_matches) == 0:
+            print(" -> found no hit, and no potential remote matches")
+        else:
+            print(f" -> found no hit, but {len(remote_matches)} potential remote matches")
+        for r in remote_matches:
+            t = remote_sigs[r]
+            print(f"\033[0;33m**** comparing to {r[:8]} ****\033[0m")
+
+            def recursecb(key, remote_hash, local_hash):
+                recout = []
+                if remote_hash in remote_sigs.keys():
+                    remote_file = target.download(remote_sigs[remote_hash].path)
+                elif remote_hash in local_sigs.keys():
+                    recout.append(f"found remote hash in local signatures ({key})!?!
(please implement that case!)") + return recout + else: + recout.append(f"could not find remote signature {remote_hash[:8]} for job {key}") + return recout + if local_hash in local_sigs.keys(): + local_file = source.download(local_sigs[local_hash].path) + elif local_hash in remote_sigs.keys(): + local_file = target.download(remote_sigs[local_hash].path) + else: + recout.append(f"could not find local signature {local_hash[:8]} for job {key}") + return recout + if local_file is None or remote_file is None: + out = "Aborting analysis because siginfo files disappered unexpectedly" + else: + out = compare_sigfiles(remote_file, local_file, recursecb, color=True) + if local_hash in local_sigs.keys(): + source.release(local_file) + else: + target.release(local_file) + target.release(remote_file) + for change in out: + recout.extend([' ' + line for line in change.splitlines()]) + return recout + + local_file = source.download(s.path) + remote_file = target.download(t.path) + out = compare_sigfiles(remote_file, local_file, recursecb, color=True) + source.release(local_file) + target.release(remote_file) + # shorten hashes from 64 to 8 characters for better readability + out = [re.sub(r'([0-9a-f]{8})[0-9a-f]{56}', r'\1', line) for line in out] + print('\n'.join(out)) + + +def main(): + args = arguments() + + if args.target.startswith('http://'): + target = SstateDavTarget(args.target) + elif args.target.startswith('s3://'): + target = SstateS3Target(args.target) + elif args.target.startswith('file://'): + target = SstateFileTarget(args.target) + else: # no protocol given, assume file:// + target = SstateFileTarget(args.target) + + args.target = target + return globals()[f'sstate_{args.command}'](**vars(args)) + + +if __name__ == '__main__': + sys.exit(main())
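A note on the dispatch in `main()`: it looks up the handler for a subcommand by name in `globals()`, so adding a new subcommand only requires defining a matching `sstate_<command>` function. The `arguments()` parser is not part of this hunk, so the flag names below are assumptions; this is a minimal standalone sketch of the same pattern, not the script itself:

```python
# Minimal sketch of name-based subcommand dispatch, as used by main():
# the handler is found as a module-level function named sstate_<command>.
import argparse


def sstate_info(target, verbose, **kwargs):
    # placeholder handler; the real script scans the cache here
    return f"info: target={target} verbose={verbose}"


def dispatch(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('command', choices=['info'])
    parser.add_argument('target')
    parser.add_argument('--verbose', action='store_true')
    args = parser.parse_args(argv)
    # same lookup as the patch: globals()[f'sstate_{args.command}'];
    # extra namespace entries (e.g. 'command') land in the handler's **kwargs
    return globals()[f'sstate_{args.command}'](**vars(args))


print(dispatch(['info', 'file:///cache', '--verbose']))
```

Passing `**vars(args)` forwards the whole namespace, which is why every `sstate_*` handler above accepts a trailing `**kwargs`.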