Fresh soft for your Amazon AMI. Part 2. Publishing your own work.



This is the second and last part of the series "Fresh soft for your Amazon AMI". In this post I aim to explain how you can rebuild an srpm package for a new version using mock and git, and how you can publish your own yum repository.

See also Part 1, "Stealing from Fedora", where I describe how you can get a source package from Fedora and rebuild it.

The rpm package structure

If you extract a source rpm package, you will find that it consists of the software source plus a number of patches and auxiliary files. The most important of them is the spec file (usually named package-name.spec). The spec file defines all the steps and variables required to turn this rather disorderly collection into a binary rpm package. For instance, this is what the contents of the redis package from the EPEL repository look like:

  • redis-2.4.10.tar.gz: main source tarball
  • redis-2.4.8-redis.conf.patch: patch for redis.conf configuration file
  • redis.init: init-script to start up the redis server
  • redis.logrotate: logrotate configuration file
  • redis.spec: rpm build specification

Rebuild rpm from spec-file. How to use mock with git

Even in Fedora Rawhide, virtualenv is quite outdated (1.7, whereas 1.8 has been available for a long time). Let's use it to master the "industrial process" of building rpm packages.

Note

I assume that you know and like git. If that's not the case, you can easily get up to speed with the marvelous git book.

First, let's try to re-build the package with no changes, just to get the idea.

Download the original package that we want to update. I don't enable Rawhide here, because I want to upgrade the original source, although starting from the Rawhide source can sometimes be easier.

I created the directory python-virtualenv-rpm beforehand:

$ yumdownloader --source python-virtualenv
$ cd python-virtualenv-rpm
$ rpm2cpio ../python-virtualenv-*.src.rpm | cpio -idmv

This package is very simple: it contains only the source and the spec. Now turn the python-virtualenv-rpm directory into a git repository:

$ git init
$ git add -A .
$ git commit -m "Initial release"

Now let's build the package from the repo we've got.

$ mock -v --scm-enable --scm-option method=git \
                       --scm-option git_get="git clone /home/mockbuild/python-virtualenv-rpm SCM_PKG" \
                       --scm-option package=python-virtualenv \
                       --scm-option spec=SCM_PKG.spec

Here SCM_PKG is the special "variable" which is replaced with the package name (python-virtualenv in this case).

This command can be read as "Build the package python-virtualenv with mock and git. To get the package spec and sources, run the command defined in git_get, find the spec file python-virtualenv.spec there, build the srpm package from this spec, and then build the final rpm".
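
To avoid retyping the options, the command above could be wrapped in a small shell function. This is only a sketch: the function name build_from_git and its two arguments are my own illustration and not part of mock, while the mock options are exactly the ones used above.

```shell
# Hypothetical wrapper; the name and arguments are illustrative, not part of mock.
# $1: path or URL of the git repository that holds the spec and sources
# $2: package name (mock substitutes it for SCM_PKG)
build_from_git() (
    repo=$1
    package=$2
    mock -v --scm-enable \
        --scm-option method=git \
        --scm-option git_get="git clone $repo SCM_PKG" \
        --scm-option package="$package" \
        --scm-option spec=SCM_PKG.spec
)
```

With this in place, the build above would become build_from_git /home/mockbuild/python-virtualenv-rpm python-virtualenv.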

Logs and results will be available in /var/lib/mock/epel-6-x86_64/result/.

A few notes before we start changing the spec file.

  • As you can see, it's trivial to publish the package on github or bitbucket instead of storing it locally. The workflow won't change at all.
  • Storing big archives in git is not a very good idea; it's better to keep them in a separate location. Mock knows how to handle this case: just put the sources of all the packages you build in a separate directory, and point mock to it with --scm-option ext_src_dir=/path/to/your/directory.
  • Most of the scm options (like "method", "spec", "ext_src_dir" and even "git_get") can be written to /etc/mock/site-defaults.cfg. See this file for more examples.

Rebuild rpm from spec-file. Update spec file

Once you have successfully rebuilt your package, modify the spec file. In the best case the only field you need to change is the version, plus, optionally, the changelog.

Additionally, you may have to add some files (like documentation) to the specification, or remove or rename existing ones.
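
The version bump itself can be scripted. Below is a minimal sketch, assuming the spec has a single "Version:" line at the start of a line; the function name bump_spec_version is hypothetical. It does not touch the changelog or the file list, which still need manual editing.

```shell
# Hypothetical helper: rewrite the Version: field of a spec file in place.
# $1: path to the spec file, $2: new version (must not contain "/" or "&")
bump_spec_version() (
    spec=$1
    new_version=$2
    # Keep the original as <spec>.orig and preserve the spacing after "Version:"
    sed -i.orig "s/^Version:\([[:space:]]*\).*/Version:\1$new_version/" "$spec"
)
```

For example, bump_spec_version python-virtualenv.spec 1.8.4 leaves the old file in python-virtualenv.spec.orig, so that diff -u can show what changed.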

In the python-virtualenv package I modified the version and the changelog and tried to rebuild. It turned out that the new release had added some files and dropped others. Namely:

  • the new version of virtualenv didn't contain the file HACKING, so I removed this file from the spec as well.
  • the new version of virtualenv had the executable file virtualenv-2.6, so I added this file to the specification.

This is what the diff looked like:

--- python-virtualenv.spec.orig     2013-01-07 14:53:06.089499165 +0000
+++ python-virtualenv.spec  2013-01-07 15:23:54.120211669 +0000
@@ -2,7 +2,7 @@
 %{!?python_sitelib: %define python_sitelib %(%{__python} -c "from distutils.sysconfig import get_python_lib; print get_python_lib()")}

 Name:           python-virtualenv
-Version:        1.7
+Version:        1.8.4
 Release:        1%{?dist}
 Summary:        Tool to create isolated Python environments

@@ -58,7 +58,7 @@

 %files
 %defattr(-,root,root,-)
-%doc docs/*txt PKG-INFO AUTHORS.txt LICENSE.txt HACKING
+%doc docs/*txt PKG-INFO AUTHORS.txt LICENSE.txt
 # Include sphinx docs on Fedora
 %if 0%{?fedora} > 0
 %doc build/sphinx/*
@@ -66,9 +66,13 @@
 # For noarch packages: sitelib
 %{python_sitelib}/*
 %attr(755,root,root) %{_bindir}/virtualenv
+%attr(755,root,root) %{_bindir}/virtualenv-2.6


 %changelog
+* Mon Jan 07 2013 Roman Imankulov <roman.imankuklov@gmail.com> - 1.8.4-1
+- Local rebuild
+
 * Tue Dec 20 2011 Steve 'Ashcrow' Milner <me@stevemilner.org> - 1.7-1
 - Update for https://bugzilla.redhat.com/show_bug.cgi?id=769067

Once you have made the changes, commit them, push them (if you use a remote repository) and rebuild the package.

I won't dive deep into the intricacies of spec file syntax. Actually, I'm not very familiar with it myself. But the simple "change and try" loop has always worked well for me so far.

Publish rpm packages to a yum repository

As we work with Amazon, S3 looks like a good place to store the data. We'll keep a local copy of the repository and sync it to S3 with every newly built package.

I keep the local copy directly in /home/mockbuild/repo. As I work with 64-bit platforms exclusively, all my binary packages are either "noarch.rpm" or "x86_64.rpm".

$ cd ~
$ mkdir -p repo/{SRPMS,x86_64}
$ mv /var/lib/mock/epel-6-x86_64/result/*.src.rpm repo/SRPMS
$ mv /var/lib/mock/epel-6-x86_64/result/*.x86_64.rpm repo/x86_64
$ mv /var/lib/mock/epel-6-x86_64/result/*.noarch.rpm repo/x86_64  # .noarch goes to x86_64 too
$ ls -d repo/* | xargs -I {} createrepo {}

Running find ./repo shows you what your repository structure looks like.

To push the data to S3, I use the s3cmd utility.
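
Since these steps are repeated after every build, they can be collected into one function. This is a rough sketch under the same layout as above (SRPMS and x86_64 subdirectories); the function name collect_rpms is hypothetical.

```shell
# Hypothetical helper: file mock results into the repository layout used above.
# $1: mock result directory, $2: repository root
collect_rpms() (
    result_dir=$1
    repo_dir=$2
    mkdir -p "$repo_dir/SRPMS" "$repo_dir/x86_64"
    for rpm in "$result_dir"/*.rpm; do
        [ -e "$rpm" ] || continue    # the glob matched nothing
        case "$rpm" in
            *.src.rpm)                 mv "$rpm" "$repo_dir/SRPMS/" ;;
            *.x86_64.rpm|*.noarch.rpm) mv "$rpm" "$repo_dir/x86_64/" ;;
        esac
    done
    # Regenerate the yum metadata of each subdirectory
    for dir in "$repo_dir/SRPMS" "$repo_dir/x86_64"; do
        createrepo "$dir"
    done
)
```

It would be called as collect_rpms /var/lib/mock/epel-6-x86_64/result ~/repo. Build logs and other non-rpm files stay in the result directory.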

# yum install s3cmd

On first launch, initialize the s3cmd configuration and create the S3 bucket where you plan to store your data. Here I created the bucket in the "eu-west-1" region.

$ s3cmd --configure
$ s3cmd mb s3://<my-bucket> --bucket-location=EU

Once the S3 bucket is created, you can push your data there. Here repo is the directory name. Mind the trailing slashes: they make a difference in s3cmd.

$ s3cmd sync --delete-removed repo s3://<my-bucket>/

Make the repository available for your yum clients

By default, Amazon S3 buckets aren't accessible from the Web. To fix this, you need to grant read access to the bucket, optionally restricting it with IP-based rules.

This is how you can make your S3 bucket publicly available:

{
   "Version": "2008-10-17",
   "Statement": [
       {
           "Sid": "AddPerm",
           "Effect": "Allow",
           "Principal": {
               "AWS": "*"
           },
           "Action": "s3:GetObject",
           "Resource": "arn:aws:s3:::<my-bucket>/*"
       }
   ]
}

And below is the example of IP-based restriction:

{
     "Version": "2008-10-17",
     "Statement": [
             {
                     "Sid": "AddPerm",
                     "Effect": "Allow",
                     "Principal": {
                             "AWS": "*"
                     },
                     "Action": "s3:GetObject",
                     "Resource": "arn:aws:s3:::<my-bucket>/*",
                     "Condition": {
                             "IpAddress": {
                                     "aws:SourceIp": "192.168.143.0/24"
                             }
                     }
             }
     ]
}

A policy like this is added in the AWS Management Console, via the "Edit bucket policy" button of your bucket.

You may replace the string value of "aws:SourceIp" with a list of strings (["1.2.3.4", "5.6.7.8"]) to allow more than one IP address or subnet to access your data.
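
If you maintain several buckets or address lists, the policy with the list form of "aws:SourceIp" can be generated instead of edited by hand. The sketch below only prints the JSON; the function name render_bucket_policy is hypothetical, and you still paste the result into the console as described above.

```shell
# Hypothetical helper: print an IP-restricted read-only bucket policy.
# $1: bucket name, remaining arguments: allowed IP addresses or subnets
render_bucket_policy() (
    bucket=$1
    shift
    # Build the JSON list items: "a", "b", ...
    ips=
    for cidr in "$@"; do
        ips="${ips:+$ips, }\"$cidr\""
    done
    cat <<EOF
{
   "Version": "2008-10-17",
   "Statement": [
       {
           "Sid": "AddPerm",
           "Effect": "Allow",
           "Principal": {
               "AWS": "*"
           },
           "Action": "s3:GetObject",
           "Resource": "arn:aws:s3:::$bucket/*",
           "Condition": {
               "IpAddress": {
                   "aws:SourceIp": [$ips]
               }
           }
       }
   ]
}
EOF
)
```

For instance, render_bucket_policy my-bucket 1.2.3.4 5.6.7.8 prints a policy that allows both addresses.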

Installing package from yum repository

Create the file /etc/yum.repos.d/local.repo with the following content:

[local]
name=Local builds
baseurl=https://s3-eu-west-1.amazonaws.com/<my-bucket>/repo/$basearch/
gpgcheck=0
enabled=0

The baseurl points to the web address of the repo subdirectory of your S3 bucket. Note that the repository is disabled by default (enabled=0), so you have to pass --enablerepo=local to yum, or set enabled=1 to turn it on permanently.

Compare the two outputs to make sure it works.
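
If you generate this file from a script, remember that $basearch must reach yum literally. A minimal sketch, where the function name render_repo_file is hypothetical and the default endpoint is the eu-west-1 one used above:

```shell
# Hypothetical helper: print the contents of the .repo file.
# $1: bucket name, $2: S3 endpoint host (defaults to the one used above)
render_repo_file() (
    bucket=$1
    endpoint=${2:-s3-eu-west-1.amazonaws.com}
    # \$basearch is escaped so that yum, not the shell, expands it
    cat <<EOF
[local]
name=Local builds
baseurl=https://$endpoint/$bucket/repo/\$basearch/
gpgcheck=0
enabled=0
EOF
)
```

As root, you could then run render_repo_file my-bucket > /etc/yum.repos.d/local.repo.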

# yum info python-pip
...
Version     : 0.8
...

# yum info --enablerepo=local python-pip
...
Version     : 1.2.1
...

My own github repository

You may want to take a look at my repository at https://github.com/imankulov/rpm-packages. It's somewhat different from what I described above, and I'd like to explain the differences.

  1. Instead of creating a separate repo for every package, I created only one, and I separated my sources and spec files for clarity.
  2. As a result, there is no clear separation between packages, but as there are not that many files there (yet), I feel comfortable with it.
  3. My git_get command looks slightly more sophisticated, because I store sources in a separate directory. For the same reason I was forced to set git_timestamps to False.
  4. There is a nice little script, helpers/download_sources.py, which I'm proud of. All it does is read the list of spec files and download the sources to the /tmp/sources directory. Very handy.
  5. There is another helper, helpers/sync_to_s3.sh, which I use to synchronize my local and remote yum repositories after I have built a new package. The name of the bucket should be defined in the YUM_S3_BUCKET environment variable.
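
To give an idea of what such a sync helper can look like, here is a minimal sketch. It is not the actual helpers/sync_to_s3.sh from the repository; the function name sync_repo_to_s3 is hypothetical, and it simply wraps the s3cmd command shown earlier around the YUM_S3_BUCKET variable.

```shell
# Hypothetical helper (not the script from the repository).
# $1: local repository directory, defaults to "repo"
sync_repo_to_s3() (
    repo_dir=${1:-repo}
    # Abort with a message if the bucket name is not configured
    : "${YUM_S3_BUCKET:?set YUM_S3_BUCKET to your bucket name}"
    # No trailing slash on the source, so the directory itself is uploaded
    s3cmd sync --delete-removed "$repo_dir" "s3://$YUM_S3_BUCKET/"
)
```

Typical usage would be to export YUM_S3_BUCKET once and then call sync_repo_to_s3 after every build.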

Final notes

Why do the same job again if somebody has already done what you want? Before rebuilding a package, check whether someone has already built it and published the spec file. Try searching for "yourpackage rpm" on github first, for instance.

Feel free to leave URLs of your git repositories with spec files or of your yum repos in the comments, no matter how small they are or how tiny a problem they solve. The only requirement: they must be compatible with Amazon Linux AMI or CentOS.

Contents © 2013 Roman Imankulov - Powered by Nikola