Discussion:
PDF to PDF (gs?): rich RGB black to plain K (CMYK) black?
sdaau
2011-06-06 05:49:08 UTC
Hi all,

I basically have the problem with print of some slides from
OpenOffice. The problem is that OpenOffice exports the PDF of the
slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
usually when I send that to the printer, they complain that what
should be plain black extends into all four (CMYK) channels, and so I
have to pay more for the ink.

So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
into a CMYK pdf where the same color is plain black (C:0, M:0, Y:0,
K:100)? I posted a similar question on

http://stackoverflow.com/questions/6241282/converting-pdf-to-cmyk-with-identify-recognizing-cmyk

... although that question is more Latex oriented. So here, I'll try
to provide my OpenOffice test case:

* Open OpenOffice Impress, use Empty Presentation, click Create
* Add some text for 'title' and 'text'
* Click File/Export as PDF; call this PDF blah-slide.pdf

At this point, close and reopen OpenOffice, for yet another slide
pdf:
* Open OpenOffice Impress, use Empty Presentation, click Create
* Add some text for 'title' and 'text'
* Click Insert/Picture/From File... and insert whatever PNG image
** I used `convert -size 10x10 xc:red img.png` to generate a PNG image
to insert
* Click File/Export as PDF; call this PDF blah-slideP.pdf

At this point, we can run ImageMagick's `identify` on both PDFs, and
we'll get:

$ identify -verbose blah-slide.pdf | grep -i 'type\|color'
Type: Grayscale
Base type: Grayscale
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black

$ identify -verbose blah-slideP.pdf | grep -i 'type\|color'
Type: TrueColor
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black

Now, I'm aware that `identify` in principle works on raster images,
but I cannot find any other application that will provide similar
color information for PDFs (any other suggestions?)

Furthermore, the only check I have for CMYK separations for now (any
other suggestions?), is to use the `tiffsep` device of GhostScript:

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide.pdf && eog p00000001.tif

(or)

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP.pdf && eog p00000001.tif

Of course, both of these show that the black color of the text is
'rich' black - on all four CMYK plates - instead of a plain 'black',
just in the K channel...

//////


So, now I finally try the command line I found in
http://www.productionmonkeys.net/guides/ghostscript/examples for
converting, as it says, "Color PDF to CMYK" - for both of these PDFs
(without and with an embedded image):

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK \
  -sOutputFile=blah-slide-out.pdf blah-slide.pdf

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK \
  -sOutputFile=blah-slideP-out.pdf blah-slideP.pdf


.. And here is now the interesting thing - if I try to run `identify`
again - *only* the pdf containing an image is the one recognized as
CMYK:

$ identify -verbose blah-slide-out.pdf | grep -i 'type\|color'
Type: Palette
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black

$ identify -verbose blah-slideP-out.pdf | grep -i 'type\|color'
Type: ColorSeparation
Base type: ColorSeparation
Colorspace: CMYK
Background color: white
Border color: cmyk(223,223,223,0)
Matte color: grey74
Transparent color: black


However, regardless of how they are reported, if I try to view their
separations:

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-out.pdf && eog p00000001.tif

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP-out.pdf && eog p00000001.tif

... I can still see that both of these PDFs still feature the text in
rich black, in all four color separations.


So, I guess my questions can be summed up as:

* How can I convert a rich black text color in an RGB pdf - into a
plain black text color in a CMYK pdf?
* Why do I need an image in the slide, so that `identify` recognizes
the "converted" CMYK pdf as being really CMYK?

(* Are there any other alternative free tools for: conversion of RGB
to CMYK pdf; and: checking the print separations of any PDF?)

As a final note: I guess this kind of thing may have something to do
(and be achievable) with ICC profiles, which unfortunately I don't
understand very much - and I've had a lot of problems finding example
command lines; so if there is such a solution, an example command line
will be much appreciated.


Thanks in advance for any responses,
Cheers!
Matti Vuori
2011-06-06 08:41:21 UTC
Post by sdaau
I basically have the problem with print of some slides from
OpenOffice. The problem is that OpenOffice exports the PDF of the
slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
usually when I send that to the printer, they complain that what
should be plain black extends into all four (CMYK) channels, and so I
have to pay more for the ink.
So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
into a CMYK pdf where the same color is plain black (C:0, M:0, Y:0,
K:100)?
I don't know, but the way I see it, the real problem here is your
incompetent printer, who should be able to do it as a matter of routine.
Helge Blischke
2011-06-06 09:26:43 UTC
Post by sdaau
[snip]
I did the following with PDFs generated by both LibreOffice and LaTex:
pdftops source.pdf - | grep -i cs | grep Device
and the result was in both cases like

/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS

As the pdftops utility (from the xpdf suite) preserves the PDF color spaces,
this means that - at least the text - is *not* "rich black".

I rather suspect that your print provider uses some unusual color conversion
in his workflow.

Helge
sdaau
2011-06-06 11:29:45 UTC
Hi all,

Thanks a lot for the prompt answers!
[snip]
Post by sdaau
So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
into a CMYK pdf where the same color is plain black (C:0, M:0, Y:0,
K:100)?
I don't know, but the way I see it, the real problem here is your
incompetent printer, who should be able to do it as a matter of routine.
Hehe :) It could well be - then again, most of these guys I worked
with (and I work with print shops on and off) simply invest a lot of
money in equipment; and when something like this comes up, their usual
response is: "just drop your file through the distiller once more",
and it gets very difficult to explain that I don't use "the
distiller" :) So I'd rather know how to give them files they won't
complain about :)
Post by sdaau
[snip]
pdftops source.pdf - | grep -i cs | grep Device
and the result was in both cases like
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
As the pdftops utility (from the xpdf suite) preserves the PDF color spaces,
this means that - at least the text - is *not* "rich black".
I rather suspect that your print provider uses some unusual color conversion
in his workflow.
It could be - but then, I'm still having the same problem, even with
pdftops:

$ pdftops blah-slide.pdf blah-slide.ps
$ grep -A 1 Device blah-slide.ps
/DeviceGray {} cs
[0] sc
/DeviceGray {} CS
[0] SC
--
/DeviceRGB {} cs
[1 1 1] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc

# final check for separations
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide.ps && eog p00000001.tif


All the tiff separations show again text on all (CMYK) channels; and
seemingly, at least the background white color seems to be treated as
RGB.
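
For a quicker look than reading the grep output by eye, the colour selections in a pdftops-style PostScript file can be tallied with a small script. This is only a sketch: it assumes the `/DeviceXXX {} cs` and `[...] sc` line layout shown in the listing above, it does not track fill (`cs`/`sc`) and stroke (`CS`/`SC`) state separately, and the function names `color_settings` and `neutral_rgb` are made up for illustration.

```python
import re

def color_settings(ps_text):
    """Return (colour space, values) pairs for each sc/SC line.

    Assumes the pdftops layout '/DeviceRGB {} cs' followed by
    '[r g b] sc'; fill and stroke state are not tracked separately,
    so the most recent colour space line is used for every sc/SC.
    """
    space = None
    results = []
    for line in ps_text.splitlines():
        m = re.match(r"\s*/(Device\w+)\s*\{\}\s*(?:cs|CS)\s*$", line)
        if m:
            space = m.group(1)
            continue
        m = re.match(r"\s*\[([^\]]*)\]\s*(?:sc|SC)\s*$", line)
        if m and space is not None:
            values = tuple(float(v) for v in m.group(1).split())
            results.append((space, values))
    return results

def neutral_rgb(settings):
    """Pick out DeviceRGB colours whose three components are equal."""
    return [v for s, v in settings if s == "DeviceRGB" and len(set(v)) == 1]

sample = """/DeviceGray {} cs
[0] sc
/DeviceGray {} CS
[0] SC
/DeviceRGB {} cs
[1 1 1] sc
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
"""

for values in neutral_rgb(color_settings(sample)):
    print("neutral colour set through DeviceRGB:", values)
```

Any colour reported here is a gray that is still expressed in RGB, so a later RGB to CMYK conversion is free to spread it over several plates.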


I also tried converting the PDF to grayscale first:

* as per: http://handyfloss.net/2008.09/making-a-pdf-grayscale-with-ghostscript/

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-slide-gray.pdf blah-slide.pdf


* as per: color PDF -> Grayscale PDF - Ubuntu Forums -
http://ubuntuforums.org/showthread.php?t=379013

$ pdf2ps -sDEVICE=psgray blah-slide.pdf blah-slide-gray.ps


..., and then back to CMYK:

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceCMYK -sColorConversionStrategy=CMYK \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-slide-gray-out.pdf \
  blah-slide-gray.pdf

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceCMYK -sColorConversionStrategy=CMYK \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-slide-gray-ps-out.pdf \
  blah-slide-gray.ps


... and if I check tiff separations again:

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-gray-out.pdf \
  && eog p00000001.tif

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-gray-ps-out.pdf \
  && eog p00000001.tif


... again the text black shows on all four separation tiffs :(


At this point, I'm wondering if the gs `tiffsep` is an appropriate
method for previewing separations at all (though, if there are images
present, it seems to parse their CMYK separations OK)... But, it
seems, there is still no reliable method to get (originally RGB) black
color to show only in K channel?

Well, any further pointers on this will be much appreciated :)

Thanks,
Cheers!
ken
2011-06-06 13:05:34 UTC
Post by sdaau
I basically have the problem with print of some slides from
OpenOffice. The problem is that OpenOffice exports the PDF of the
slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
usually when I send that to the printer, they complain that what
should be plain black extends into all four (CMYK) channels, and so I
have to pay more for the ink.
RGB->CMYK conversion often results in a mixture of CMY as well as black.
OpenOffice being a display-oriented application (like Microsoft Office)
probably only sets colours in RGB.

One thing you could try is printing to a PostScript file and converting
that into PDF as a separate step. You haven't said which OS you are
using, though I'm assuming some flavour of Linux. However it's often the
case that PostScript printer drivers understand about CMYK and will
convert RGB into sensible colours.
Post by sdaau
However, regardless of how they are reported, if I try to view their
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-out.pdf && eog p00000001.tif
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP-out.pdf && eog p00000001.tif
... I can still see that both of these PDFs still feature the text in
rich black, in all four color separations.
You are still converting the RGB into CMYK, if the
undercolorremoval/blackgeneration doesn't convert equal values of RGB
into CMYK, then you get a CMY output. It doesn't really matter which PDF
interpreter does this.
Post by sdaau
* How can I convert a rich black text color in an RGB pdf - into a
plain black text color in a CMYK pdf?
There are a number of things you could try, but I would suggest either
printing to PostScript, and then using GS or sending the PDF file to
Ghostscript. Because GS is a PostScript interpreter, there are things
which can be done to colours.

It is possible to redefine the setcolor and setrgbcolor operators so
that they convert equal amounts of RGB into a colour specification in
DeviceGray instead (DeviceGray will convert to pure black in a CMYK
workflow).

It's also possible to set up an under colour removal function which
significantly affects how RGB is converted to CMYK (this is covered in
the PostScript Language Reference Manual).
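
To make the role of black generation and undercolour removal concrete, here is a small Python sketch of the DeviceRGB to DeviceCMYK conversion as the PostScript Language Reference Manual describes it. It only illustrates the arithmetic; the function name `rgb_to_cmyk` and the two default functions are assumptions chosen for the example, and a real interpreter or print workflow may well use different functions or an ICC-based conversion instead.

```python
def rgb_to_cmyk(r, g, b,
                black_generation=lambda k: k,
                undercolor_removal=lambda k: k):
    """Convert RGB components (0..1) to CMYK components (0..1).

    c, m, y start as the complements of r, g, b; k is their minimum.
    The undercolour removal function decides how much of k is taken
    out of c, m, y, and the black generation function decides how
    much black ink is laid down.
    """
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b
    k = min(c, m, y)
    ucr = undercolor_removal(k)

    def clamp(v):
        return min(1.0, max(0.0, v))

    return (clamp(c - ucr), clamp(m - ucr), clamp(y - ucr),
            clamp(black_generation(k)))

# Full black generation and full undercolour removal:
# RGB black ends up on the K plate only.
print(rgb_to_cmyk(0, 0, 0))

# Same black, but with no undercolour removal:
# ink lands on all four plates ("rich" black).
print(rgb_to_cmyk(0, 0, 0, undercolor_removal=lambda k: 0.0))
```

With these particular functions any R = G = B input leaves C, M and Y at zero; whether a given device does the same depends entirely on the functions it has installed.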


If you can post a (small!) example file, preferably a single page, to
some publicly accessible URL I could take a look.



Ken
sdaau
2011-06-06 15:34:12 UTC
Hi Ken,

Thanks for the response!
Post by ken
Post by sdaau
I basically have the problem with print of some slides from
OpenOffice. The problem is that OpenOffice exports the PDF of the
slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
usually when I send that to the printer, they complain that what
should be plain black extends into all four (CMYK) channels, and so I
have to pay more for the ink.
RGB->CMYK conversion often results in a mixture of CMY as well as black.
OpenOffice being a display-oriented application (like Microsoft Office)
probably only sets colours in RGB.
Yeah, that was my suspicion too - thanks for confirming!
Post by ken
One thing you could try is printing to a PostScript file and converting
that into PDF as a separate step. You haven't said which OS you are
using, though I'm assuming some flavour of Linux. However it's often the
case that PostScript printer drivers understand about CMYK and will
convert RGB into sensible colours.
Yup, it's Ubuntu Linux I'm using - and yes, I'm seeing the advice
about PostScript as an intermediate step more and more, as in:

How to convert pdf to monochrome?... - http://www.groupsrv.com/computers/about669835.html
Post by ken
Post by sdaau
Print the original .pdf to PostScript in a file, edit the PostScript,
save [get then put/def] the existing PostScript definition (probably
builtin) of any operator that sets color (setcolor, setcmycolor,
setrgbcolor, setgray, ...), then install a new definition that does
whatever you want based on the actual arguments and the saved
original definitions.
... unfortunately, I do not understand the postscript language enough
to understand this advice :)

However, I've made some progress by manually hacking a postscript
file, which I'm hoping to post about next...
Post by ken
Post by sdaau
However, [snip]
... I can still see that both of these PDFs still feature the text in
rich black, in all four color separations.
You are still converting the RGB into CMYK, if the
undercolorremoval/blackgeneration doesn't convert equal values of RGB
into CMYK, then you get a CMY output. It doesn't really matter which PDF
interpreter does this.
Yes - but I was hoping, that if I 'properly' use color profiles
(whatever 'properly' is), I could sort of have this conversion go "the
right way" in this case: i.e. if it encounters R = G = B (grayscale);
then treat it as K:100 CMY:0...

I found:
http://git.ghostscript.com/?p=ghostpdl.git;a=blob_plain;f=gs/doc/GS9_Color_Management.pdf;hb=acdf790792b31d1581a4ae6eb8926128f4876214

and there it talks of DefaultRGB/CMYK/GrayProfile - and additionally -
sOutputICCProfile (and others); so I came to 'monster' command lines
like these:

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sICCProfilesDir=/usr/share/ghostscript/9.02/iccprofiles/ \
  -dUseCIEColor \
  -sDefaultGrayProfile=default_gray.icc \
  -sDefaultRGBProfile=default_rgb.icc -sProcessColorModel=DeviceGray \
  -sColorConversionStrategy=Gray -sOutputICCProfile=default_cmyk.icc \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-test.pdf blah-slide.pdf \
  && gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-test.pdf && eog p00000001.tif

... just to see if some combo would work, but unfortunately not :)
Post by ken
Post by sdaau
* How can I convert a rich black text color in an RGB pdf - into a
plain black text color in a CMYK pdf?
There are a number of things you could try, but I would suggest either
printing to PostScript,
Ah - you actually meant something like choosing a .ps output (or
"printing" to a .ps file) directly from OpenOffice? Yeah, that sounds
like it should save a processing step...
Post by ken
and then using GS or sending the PDF file to
Ghostscript. Because GS is a PostScript interpreter, there are things
which can be done to colours.
It is possible to redefine the setcolor and setrgbcolor operators so
that they convert equal amounts of RGB into a colour specification in
DeviceGray instead (DeviceGray will convert to pure black in a CMYK
workflow).
It's also possible to set up an under colour removal function which
significantly affects how RGB is converted to CMYK (this is covered in
the PostScript Language Reference Manual).
Thanks for noting this - I was somewhat aware that postscript language
can also "process", but I am completely ignorant about its scope. Re:
the DeviceGray CMYK workflow, I was intuitively trying to follow that
in the above "monster" cmdline too (as in: force all colors to
grayscale during conversion [in conversion colorspace], and write out
CMYK values based on these grayscale ones, which hopefully end up only
on the K plate) -- but I couldn't get `tiffsep` to confirm that.
Post by ken
If you can post a (small!) example file, preferably a single page, to
some publicly accessible URL I could take a look.
Sure, here are the pdf's of the slides mentioned in the OP:

http://sdaaubckp.sf.net/post/img/blah-slide.pdf
http://sdaaubckp.sf.net/post/img/blah-slideP.pdf

Many thanks for looking into this, :)
Cheers!
ken
2011-06-06 16:28:51 UTC
Post by sdaau
How to convert pdf to monochrome?... - http://www.groupsrv.com/computers/about669835.html
Post by ken
Post by sdaau
Print the original .pdf to PostScript in a file, edit the PostScript,
save [get then put/def] the existing PostScript definition (probably
builtin) of any operator that sets color (setcolor, setcmycolor,
setrgbcolor, setgray, ...), then install a new definition that does
whatever you want based on the actual arguments and the saved
original definitions.
... unfortunately, I do not understand the postscript language enough
to understand this advice :)
I can do that for you, as can others here, but it would be helpful to
see an example. Ideally a PDF and PostScript file of a single page file,
passed through your workflow.

That way we can be more certain about what to do, and give you better
advice on how to achieve what you want.
Post by sdaau
Post by ken
You are still converting the RGB into CMYK, if the
undercolorremoval/blackgeneration doesn't convert equal values of RGB
into CMYK, then you get a CMY output. It doesn't really matter which PDF
interpreter does this.
Yes - but I was hoping, that if I 'properly' use color profiles
(whatever 'properly' is), I could sort of have this conversion go "the
right way" in this case: i.e. if it encounters R = G = B (grayscale);
then treat it as K:100 CMY:0...
You only get CMYK output if you have an interpreter which applies the
ICC (via a Colour Management System) profile to create CMYK. In general
you won't get this.

What usually happens is that you get a PDF which contains colours in an
ICCBased colour space. Which your print shop probably won't like either.

Or possibly the colours still specified in RGB, but an OutputProfile
attached, which simply describes the RGB space for which these were
intended. A fully ICC compliant workflow (ie including your printer)
would be able to create a link from the ICC profile in the PDF to the
ICC profile used for the printing device, and everything would magically
work out. This is rare, it usually only works on closed workflows (that
is, not accepting submissions from the outside world)
Post by sdaau
Post by ken
Post by sdaau
* How can I convert a rich black text color in an RGB pdf - into a
plain black text color in a CMYK pdf?
There are a number of things you could try, but I would suggest either
printing to PostScript,
Ah - you actually meant something like choosing a .ps output (or
"printing" to a .ps file) directly from OpenOffice? Yeah, that sounds
like it should save a processing step...
Well, I meant you could do that to get a PostScript file, which it might
be easier to massage into the form you want before converting it into
PDF (assuming that's what your print shop wants as a submission).
Post by sdaau
Post by ken
It's also possible to set up an under colour removal function which
significantly affects how RGB is converted to CMYK (this is covered in
the PostScript Language Reference Manual).
Thanks for noting this - I was somewhat aware that postscript language
can also "process", but I am completely ignorant about its scope.
PostScript is a Turing-complete programming language. While there are
things that are hard to do, very little is impossible.
Post by sdaau
Post by ken
If you can post a (small!) example file, preferably a single page, to
some publicly accessible URL I could take a look.
http://sdaaubckp.sf.net/post/img/blah-slide.pdf
http://sdaaubckp.sf.net/post/img/blah-slideP.pdf
Many thanks for looking into this, :)
I'll go pull the files down now.


Ken
ken
2011-06-07 09:21:06 UTC
Post by sdaau
Post by ken
It's also possible to set up an under colour removal function which
significantly affects how RGB is converted to CMYK (this is covered in
the PostScript Language Reference Manual).
Thanks for noting this - I was somewhat aware that postscript language
the DeviceGray CMYK workflow, I was intuitively trying to follow that
in the above "monster" cmdline too (as in: force all colors to
grayscale during conversion [in conversion colorspace], and write out
CMYK values based on these grayscale ones, which hopefully end up only
on the K plate) -- but I couldn't get `tiffsep` to confirm that.
Post by ken
If you can post a (small!) example file, preferably a single page, to
some publicly accessible URL I could take a look.
http://sdaaubckp.sf.net/post/img/blah-slide.pdf
http://sdaaubckp.sf.net/post/img/blah-slideP.pdf
I wasn't able to do this conversion in a single pass using the
Ghostscript PDF interpreter, because it uses the setrgbcolor directly
from systemdict, so it doesn't allow for replacement.

Instead I first converted your files to PostScript using the ps2write
device:

gs -sDEVICE=ps2write -sOutputFile=./out.ps ./blah-slide.pdf

Then I created a simple replacement routine, and stored it in a file
called HackRGB.ps:

%!
/oldsetrgbcolor /setrgbcolor load def
/setrgbcolor {
  (in replacement setrgbcolor\n) print
                              %% R G B
  1 index 1 index             %% R G B G B
  eq {                        %% R G B
    2 index 1 index           %% R G B R B
    eq {
      %% Here if R = G = B
      pop pop                 %% remove two values
      setgray
    } {
      oldsetrgbcolor          %% set the RGB values
    } ifelse
  } {
    oldsetrgbcolor            %% Set the RGB values
  } ifelse
} bind def

This replaces the setrgbcolor operator with a routine which tests the
RGB value and if all components are equal it replaces it with a call to
setgray using just one of the components. (BTW you can remove the line
ending in 'print'; it's just there so that you can see something is
happening ;-)

I then converted the PostScript file back to PDF, but using this code:

gs -sDEVICE=pdfwrite -sOutputFile=./out.pdf ./HackRGB.ps ./out.ps

This results in a PDF file where the text is in a shade of gray. This
*ought* to be acceptable to your print shop, because gray should map
straight to the K channel of CMYK.

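If for some reason that isn't acceptable, you could replace the
'setgray' with '1 exch sub 0 0 0 4 -1 roll setcmykcolor', which uses
CMYK directly (the '1 exch sub' turns the gray level into a K amount,
since gray 0 is black but K 0 means no black ink).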
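
For readers who don't speak PostScript, the decision the replacement operator makes can be sketched in Python. This is an illustration of the logic only, not part of the workflow; `replacement_setrgbcolor` and `gray_to_cmyk` are hypothetical names. The second function spells out how a gray level relates to a K-only CMYK colour: gray 0 is black and gray 1 is white, while K counts ink, so the value has to be inverted.

```python
def replacement_setrgbcolor(r, g, b):
    """Mirror the test in HackRGB.ps: equal components become a gray."""
    if r == g == b:
        # Neutral colour: hand a single component to setgray.
        return ("gray", r)
    # Anything else is passed through to the original setrgbcolor.
    return ("rgb", r, g, b)

def gray_to_cmyk(gray):
    """Express a gray level (0 = black, 1 = white) as K-only CMYK."""
    return (0.0, 0.0, 0.0, 1.0 - gray)

print(replacement_setrgbcolor(0, 0, 0))    # black text
print(replacement_setrgbcolor(1, 0, 0))    # a red fill, left alone
print(gray_to_cmyk(0))                     # black as K-only CMYK
```

Note that the comparison is exact, as in the PostScript: a colour such as 0.2353 0.2353 0.2354 would be left in RGB.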

NOTE! This only affects linework (text, vectors), only affects linework
using RGB and will only convert that to gray if the R, G and B values
are identical. Images, shadings and potentially other object types will
not be affected.

I should also mention that going from PDF to PostScript and back to PDF
is a potentially lossy process which can introduce errors and odd
artefacts, you should check files carefully after this conversion. I
haven't tested this code particularly.

If you print directly to PostScript then you can eliminate one
conversion step, which is probably worthwhile.


Ken
sdaau
2011-06-09 12:28:58 UTC
Hi all,

Many, many thanks for the assistance with this problem! I believe it
is more or less solved now - somewhat of a mammoth post follows, but
first a summary:

* forceblack.ps uses `pdftops` PS file, manipulates /DeviceRGB,
/setcolorspace
* HackRGB.ps uses `gs` ps2write PS file, manipulates /setrgbcolor,
/setgray
Post by Helge Blischke
If you convert your OOo generated PDFs to PostScript using
pdftops (from the xpdf suite) and then prepend the attached
forceblack.ps to the resulting PostScript file, RGB colors where
R==G==B will be printed as pure black (replaced with the
appropriate gray value).
Note that this trick won't work with PDF input or PostScript
produced using Ghostscript's ps2write device.
Thanks for that, Helge Blischke; here is a command line log of what I
tried:


pdftops blah-slide.pdf blah-slide-tops.ps

cat forceblack.ps blah-slide-tops.ps > blah-slide-forceblack.ps
# blah-slide-forceblack.ps has wrong page size in evince!

# check if /pdfEndPage occurs only once:
sed -n '/\/pdfE/{p}' blah-slide.ps

# insert forceblack.ps after line where /pdfEndPage occurs:
## sed without -n will output entire input file
## the 'r' command reads in forceblack.ps, and
## adds/inserts it after the matching line
## http://www.grymoire.com/Unix/Sed.html#uh-0

sed '/\/pdfE/r forceblack.ps' blah-slide-tops.ps > blah-slide-forceblack.ps

# now blah-slide-forceblack.ps is the correct page size!

# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-forceblack.ps \
  && eog p01.tif 2>/dev/null


... and again, `gs` with `tiffsep` shows black text on all four CMYK
plates.

I guess, this is what they call "preflight", as in checking whether
the separations are coming out right - and again, I'm not sure how
reliable `gs` with `tiffsep` is; but I don't know of any other tool in
Linux that could open a PDF/PS and show expected CMYK separations; so
if anyone has any alternatives to `gs` with `tiffsep` on Linux, please
write back. Then again, there is the problem that the printer guy may
not necessarily obtain the same CMYK separations as I do (regardless
of the software I use to render these) - but at least, for now `gs`
with `tiffsep` offers at least a starting point...

Possibly, the problem may end up boiling down to `gs` with `tiffsep`,
as a "preflight" software - *and* my printer's actual setup - may
choose to send (gray) RGB values (or even values declared as
Grayscale) to all four CMYK plates; while that may not be the case
with other print setups or shops (for the same PDF or PS file). Which
is why in that case, the best for me would be to explicitly try to
convert gray R=G=B values into CMY:0+K values, instead of into
Grayscale?

(anecdote that may confirm this: these days I had a more-less textual
content document from `pdflatex`, split into ranges with `pdftk`,
printed on an office laser printer [don't know the brand] - apparently
something in the laser printer was misaligned, and I could see blueish
[though not 'actual' cyan] outline leaking less than 1 mm 'northwest'
of each and every letter; don't know if that would be a proof of RGB
text black being interpreted in that chain [I guess they used Windows
or Mac to print] as 'rich' black, i.e. C:1 M:1 Y:1 K:1).
Post by Helge Blischke
I wasn't able to do this conversion in a single pass using the
Ghostscript PDF interpreter, because it uses the setrgbcolor
directly from systemdict, so it doesn't allow for replacement.
Instead I first converted your files to PostScript using the
gs -sDEVICE=ps2write -sOutputFile=./out.ps ./blah-slide.pdf
Thanks for that, Ken - good to know some constructs don't allow
replacements..
Post by Helge Blischke
Then I created a simple replacement routine, and stored it in a
... [snip] ...
This replaces the setrgbcolor operator with a routine which
tests the RGB value and if all components are equal it replaces
it with a call to setgray using just one of the components. (BTW
you can remove the line ending in 'print', it's just there so
that you can see something is happening ;-)
gs -sDEVICE=pdfwrite -sOutputFile=./out.pdf ./HackRGB.ps ./out.ps
Many thanks for the commented code (and the tip for `print` for
debugging postscript - and the example of how to use an 'external'
postscript routine with `ghostscript` :) ); this is what I tried:


# I had to add -dNOPAUSE -dBATCH to avoid having
# '>>showpage, press <return> to continue<<' and
# the prompt 'GS>' shown...

gs -dNOPAUSE -dBATCH -sDEVICE=ps2write \
  -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB.pdf ./HackRGB.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB.pdf && eog p01.tif 2>/dev/null


Sadly, similarly to the use of the previous forceblack.ps, I again get
all four separations here showing letters...
Post by Helge Blischke
This results in a PDF file where the text is in a shade of gray.
This *ought* to be acceptable to your print shop, because gray
should map straight to the K channel of CMYK.
Yes - but, as I commented previously: if the process that they have at
the printer's shop behaves the same as `gs` with `tiffsep`, then
they'll still see what I see - that is, the black for text letters
showing on all four CMYK plates.
Post by Helge Blischke
If for some reason that isn't acceptable, you could replace the
'setgray' with '0 0 0 4 -1 roll setcmykcolor' which uses CMYK
directly.
Ahh - thanks for that; now that looks very promising to me :) !

I did the replacement, called that version HackRGB-cmyk.ps, and tried
this:


gs -dNOPAUSE -dBATCH -sDEVICE=ps2write \
  -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null


... and - partial success here: CMY plates are *finally* blank white -
but the K plate is inverted (what should be white background, is shown
in black; and the letters are grayer than that) :) Interestingly, the
same effect is shown if I open blah-slide-hackRGB-cmyk.pdf in
`evince`, too. Also interestingly, if I use the `pdftops` output
(blah-slide-tops.ps), then the final pdf is not inverted - but the
separations show again black text on all four CMYK plates:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-tops.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null

... and since the debug text "in replacement setrgbcolor" never
appears on stdout, this means the procedure is not even triggered!


Then, since the whole problem seems to be a simple matter of
calculating K=1-R (instead of using K=R), I tried to modify the script
Post by Helge Blischke
PostScript, The Forgotten Art of Programming | Linux Journal - http://www.linuxjournal.com/article/2386
http://homepage.mac.com/andykopra/pdm/tutorials/an_introduction_to_postscript.html
... it seems to have worked :) First, I tried use ghostscript in
command line mode, reconstructing a simple stack and pasting
modifications of the HackRGB-cmyk.ps so I could see what I was writing
- as a noob note, those commands are here:

http://sdaaubckp.sourceforge.net/post/ps/debug-paste-cmds.ps

... and here is what ghostscript writes on output:

http://sdaaubckp.sourceforge.net/post/ps/debug-paste-cmds.ps.log


Finally, this is what the modified HackRGB-cmyk.ps looks like:

http://sdaaubckp.sourceforge.net/post/ps/HackRGB-cmyk-inv.ps

... the difference from HackRGB-cmyk.ps being:

- 0 0 0 4 -1 roll setcmykcolor
+ 0 0 0 4 -1 roll -1 mul 1 add setcmykcolor

... along with an added piece of code that will do the same for
setgray values.


To finally confirm all is OK, I run:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk-inv.pdf ./HackRGB-cmyk-inv.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk-inv.pdf && eog p01.tif 2>/dev/null

... and now, I do get black text on white background only in the K
plate, while CMY plates are blank white :) Although, I'm not sure how
accurate of a formula K=1-R is; if someone can suggest a more accurate
formula, please write back!
Post by Helge Blischke
NOTE! This only affects linework (text, vectors), only affects
linework using RGB and will only convert that to gray if the R,
G and B values are identical. Images, shadings and potentially
other object types will not be affected.
Thanks for that - for color images, I anyway have to pay all four
inks, so for them, maybe 'rich' black is preferable; for shadings -
yeah, will have to look into that once I get a problem with it :)
Post by Helge Blischke
I should also mention that going from PDF to PostScript and back
to PDF is a potentially lossy process which can introduce errors
and odd artefacts, you should check files carefully after this
conversion. I haven't tested this code particularly.
Right - and a note to myself: after pdf to ps (and thus in the final
round trip from ps to pdf) text information is gone - all the font
glyphs apparently become treated as curves (since I cannot select or
copy the text in `evince` anymore); I guess hyperlinks would be gone
too - but it doesn't matter really; as this is a document specifically
intended for a print shop :)
Post by Helge Blischke
If you print directly to PostScript then you can eliminate one
conversion step, which is probably worthwhile.
I haven't really tried this, but as far as I can see, there are
several PostScript dialects - and if I, say, export from OpenOffice to
PS directly, I'm not sure it is guaranteed that the output will
feature /setrgbcolor type syntax (instead of the /DeviceRGB,
/setcolorspace type syntax). Which is why it's good to know that `gs` +
`ps2write` would help in such a case regardless :)
Post by Helge Blischke
You only get CMYK output if you have an interpreter which
applies the ICC (via a Colour Management System) profile to
create CMYK. In general you won't get this.
What usually happens is that you get a PDF which contains
colours in an ICCBased colour space. Which your print shop
probably won't like either.
Or possibly the colours still specified in RGB, but an
OutputProfile attached, which simply describes the RGB space for
which these were intended. A fully ICC compliant workflow (ie
including your printer) would be able to create a link from the
ICC profile in the PDF to the ICC profile used for the printing
device, and everything would magically work out. This is rare,
it usually only works on closed workflows (that is, not
accepting submissions from the outside world)
Thanks for this - I can see that I still fail to understand properly
how ICC profiles really work; however, I hope with the solution above
I won't need to :) Especially thanks for the 'closed workflow' comment
- I was suspecting that may be the case, but I'm not that involved
with the industry to have actual experience of the kind...
Post by Helge Blischke
PostScript is a Turing-complete programming language. While
there are things that are hard to do, very little is impossible.
Heh - as soon as I saw this, I started reading up on it a bit - and
'oh dear' - there is a LOT of history involved in this, and reverse
Polish Notation doesn't make it any easier :) But I was glad I could
Post by Helge Blischke
/sys_setcolorspace /setcolorspace load def
/setcolorspace {
...
/oldsetrgbcolor /setrgbcolor load def
/setrgbcolor {
\let\Oldincludegraphics\includegraphics
\renewcommand{\includegraphics}[1]{\Oldincludegraphics[width=\maxwidth]{#1}}
... in Latex, no? :)


Anyways - thanks again everyone for the help with this problem,
Cheers!
ken
2011-06-09 13:00:39 UTC
Permalink
In article <da825ba1-f2f2-4ffe-8ea2-5cd3b4518e73
@p13g2000yqh.googlegroups.com>, ***@imi.aau.dk says...
Post by sdaau
Post by ken
If for some reason that isn't acceptable, you could replace the
'setgray' with '0 0 0 4 -1 roll setcmykcolor' which uses CMYK directly.
Ahh - thanks for that; now that looks very promising to me :) !
I did the replacement, called that version HackRGB-cmyk.ps, and tried
gs -dNOPAUSE -dBATCH -sDEVICE=ps2write \
  -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null
... and - partial success here: CMY plates are *finally* blank white -
but the K plate is inverted (what should be white background, is shown
in black; and the letters are grayer than that) :)
Oops, my fault, try '0 0 0 4 -1 roll 1 exch sub setcmykcolor' instead.
Gray is inverse polarity, so 1 setgray produces white while 0 setgray
produces black. Subtracting from 1 yields the reverse result, which
should work.
Post by sdaau
Interestingly, the
same effect is shown if I open blah-slide-hackRGB-cmyk.pdf in
`evince`, too. Also interestingly, if I use the `pdftops` output
(blah-slide-tops.ps), then the final pdf is not inverted - but the
separations show again black text on all four CMYK plates:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-tops.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null
... and since the debug text "in replacement setrgbcolor" never
appears on stdout, this means the procedure is not even triggered!
Presumably because there are no colours specified using setrgbcolor....
Post by sdaau
- 0 0 0 4 -1 roll setcmykcolor
+ 0 0 0 4 -1 roll -1 mul 1 add setcmykcolor
... along with an added piece of code that will do the same for
setgray values.
Mea culpa, I forgot to invert the gray values. Still, well done on
sorting out the setgray yourself !
Post by sdaau
... and now, I do get black text on white background only in the K
plate, while CMY plates are blank white :) Although, I'm not sure how
accurate of a formula K=1-R is; if someone can suggest a more accurate
formula, please write back!
It only occurs (in the code I wrote at least) when R=G=B, so it's a shade
of gray. In that case it doesn't matter what component you choose, they
are all the same :-)
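
A minimal sketch of that rule in shell, for checking values by hand. The function name gray_to_k is hypothetical, and the linear K = 1 - R mapping is simply the one used in this thread, not a colour-managed conversion.

```shell
# Print the CMYK replacement for a gray RGB triple (R = G = B),
# or "unchanged" when the three components differ.
gray_to_k() {
  awk -v r="$1" -v g="$2" -v b="$3" 'BEGIN {
    if (r + 0 == g + 0 && g + 0 == b + 0)
      printf "0 0 0 %g\n", 1 - r    # C M Y K with K = 1 - R
    else
      print "unchanged"
  }'
}

gray_to_k 0 0 0          # black text
gray_to_k 0.8 0.8 0.8    # light gray
gray_to_k 1 0 0          # pure red is left alone
```

Because the branch is only taken when all three components are equal, using R, G or B in the subtraction gives the same K.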
Post by sdaau
Post by ken
I should also mention that going from PDF to PostScript and back
to PDF is a potentially lossy process which can introduce errors
and odd artefacts, you should check files carefully after this
conversion. I haven't tested this code particularly.
Right - and a note to myself: after pdf to ps (and thus in the final
round trip from ps to pdf) text information is gone - all the font
glyphs apparently become treated as curves (since I cannot select or
copy the text in `evince` anymore); I guess hyperlinks would be gone
too - but it doesn't matter really; as this is a document specifically
intended for a print shop :)
pswrite is *really* basic, ps2write does a much better job. All text is
converted to outlines by pswrite, which is one reason the output tends
to be huge. There are many other compromises too.
Post by sdaau
Post by ken
If you print directly to PostScript then you can eliminate one
conversion step, which is probably worthwhile.
I haven't really tried this, but as far as I can see, there are
several PostScript dialects
Not really. There are three basic levels, 1 to 3 where 1 is ancient, 2
is quite old and 3 is (comparatively) new. Most producers only create
level 2 PostScript anyway, which will run on any printer you can buy, or
could have bought in the last 10 years probably.

Only specialist DTP applications like Quark XPress, Adobe Illustrator,
InDesign etc generally create level 3 PostScript. However, if you start
from a PDF file and convert to PostScript using something like Acrobat,
it may offer to save as level 3. Generally that should be fine, but you
might want to check that one with your printer if you ever find yourself
doing it.

OTOH If they can accept PDF I'd be amazed if their RIP couldn't also
handle level 3 PostScript.



Ken
sdaau
2011-06-09 13:48:15 UTC
Permalink
Post by ken
Post by sdaau
... and - partial success here: CMY plates are *finally* blank white -
but the K plate is inverted (what should be white background, is shown
in black; and the letters are grayer than that) :)
Oops, my fault, try '0 0 0 4 -1 roll 1 exch sub setcmykcolor' instead.
Gray is inverse polarity, so 1 setgray produces white while 0 setgray
produces black. Subtracting from 1 yields the reverse result, which
should work.
Awesome - thanks for the '1 exch sub' construct - read up on
http://www.tailrecursive.org/postscript/operators.html#exch and I
think I see how it works :)
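
To follow the stack step by step without starting `gs`, here is a small sketch that evaluates just the handful of operators involved and prints the stack (bottom first) after each token. The helper name rpn is hypothetical, and it only understands numbers, exch, sub, add, mul and roll; it is an illustration, not a PostScript interpreter.

```shell
rpn() {
  printf '%s\n' "$1" | awk '{
    sp = 0
    for (i = 1; i <= NF; i++) {
      t = $i
      if (t == "exch")      { x = s[sp]; s[sp] = s[sp-1]; s[sp-1] = x }
      else if (t == "sub")  { s[sp-1] = s[sp-1] - s[sp]; sp-- }
      else if (t == "add")  { s[sp-1] = s[sp-1] + s[sp]; sp-- }
      else if (t == "mul")  { s[sp-1] = s[sp-1] * s[sp]; sp-- }
      else if (t == "roll") {
        j = s[sp]; n = s[sp-1]; sp -= 2
        j = ((j % n) + n) % n          # normalise to 0..n-1 shifts toward the top
        for (k = 0; k < n; k++) tmp[(k + j) % n] = s[sp - n + 1 + k]
        for (k = 0; k < n; k++) s[sp - n + 1 + k] = tmp[k]
      }
      else                  { s[++sp] = t + 0 }   # anything else is a number
      line = t " ->"
      for (k = 1; k <= sp; k++) line = line " " s[k]
      print line
    }
  }'
}

# The gray value 0.8 is already on the stack when the replacement code runs.
rpn '0.8 0 0 0 4 -1 roll 1 exch sub'
rpn '0.8 -1 mul 1 add'
```

Both variants end with 1 - 0.8 on top of the stack, so `1 exch sub` and `-1 mul 1 add` are interchangeable here; the first trace also shows `4 -1 roll` moving the gray value above the three zeros, which is the operand order setcmykcolor expects.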
Post by ken
Still, well done on
sorting out the setgray yourself !
Cheers - wouldn't have done it if I wasn't encouraged by your code
comments :)
Post by ken
Post by sdaau
... Although, I'm not sure how
accurate of a formula K=1-R is; if someone can suggest a more accurate
formula, please write back!
It only occurs (in the code I wrote at least) when R=G=B, so it's a shade
of gray. In that case it doesn't matter what component you choose, they
are all the same :-)
Yes, but I meant more from a perceptual perspective: for instance, I
am pretty certain that black as (CMY)K:(0,0,0),1 on paper should
correspond to RGB:0,0,0 on screen; and that white as (CMY)K:(0,0,0),0
should correspond to RGB:1,1,1. However, would RGB:0.8,0.8,0.8 map
linearly to K:0.2 - or are there some 'transformations' involved, when
mapping perception of grayscale from screen to paper (e.g. instead of
K=1-R, may be something like K=1-0.2*(5^R) would be more
appropriate) ?
Post by ken
Post by sdaau
Post by ken
I should also mention that going from PDF to PostScript and back
to PDF is a potentially lossy process which can introduce errors
and odd artefacts, you should check files carefully after this
conversion. I haven't tested this code particularly.
Right - and a note to myself: after pdf to ps (and thus in the final
round trip from ps to pdf) text information is gone - all the font
glyphs apparently become treated as curves (since I cannot select or
copy the text in `evince` anymore); ...
pswrite is *really* basic, ps2write does a much better job. All text is
converted to outlines by pswrite, which is one reason the output tends
to be huge. There are many other compromises too.
Just to make sure - I was using the ps2write (not pswrite) in the
example above, and that also seems to 'flatten' the text (although, as
I noted, I don't mind that, and the other compromises - as long as the
print comes out nice :) )..
Post by ken
Post by sdaau
Post by ken
If you print directly to PostScript then you can eliminate one
conversion step, which is probably worthwhile.
I haven't really tried this, but as far as I can see, there are
several PostScript dialects
Not really. There are three basic levels, 1 to 3 where 1 is ancient, 2
is quite old and 3 is (comparatively) new. Most producers only create
level 2 PostScript anyway, which will run on any printer you can buy, or
could have bought in the last 10 years probably.
Only specialist DTP applications like Quark XPress, Adobe Illustrator,
InDesign etc generally create level 3 PostScript. However, if you start
from a PDF file and convert to PostScript using something like Acrobat,
it may offer to save as level 3. Generally that should be fine, but you
might want to check that one with your printer if you ever find yourself
doing it.
OTOH If they can accept PDF I'd be amazed if their RIP couldn't also
handle level 3 PostScript.
Thanks for noting that; good to have the notion, that level 2 should
still be generally safe to use.

I guess 'dialects' was the wrong word to use; when I wrote that, I was
Post by ken
Post by sdaau
... and since the debug text "in replacement setrgbcolor" never
appears on stdout, this means the procedure is not even triggered!
Presumably because there are no colours specified using setrgbcolor....
... if not a different dialect, then sure there seems to be different
ways of specifying color: for instance, depending on how a conversion
from PDF to PS is performed, the PS file may or may not specify colors
using setrgbcolor. And I guess, that is what would limit the usability
of a script like HackRGB.ps?
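
One rough way to see in advance which style a given PS file uses is to count the lines that mention each colour operator. This is only a sketch: the function name count_color_ops and the sample file are made up, `grep -w` is assumed to be available (GNU and BSD grep have it), and a file that hides the operators behind its own short procedure names or inside compressed data will not be counted correctly.

```shell
# Print, for each colour operator, how many lines of the file mention it.
count_color_ops() {
  for op in setrgbcolor setgray setcmykcolor setcolorspace setcolor; do
    # -w matches whole words only, so "setcolor" does not also count
    # the lines that contain "setcolorspace".
    printf '%s %s\n' "$op" "$(grep -c -w -- "$op" "$1")"
  done
}

# Tiny hypothetical sample in the /DeviceRGB setcolorspace style.
sample=$(mktemp)
printf '%s\n' '/DeviceRGB setcolorspace' \
              '0 0 0 setcolor' \
              '0.5 0.5 0.5 setcolor' > "$sample"

count_color_ops "$sample"
```

A file that reports zero for setrgbcolor but non-zero for setcolorspace and setcolor would explain why the debug text from a setrgbcolor replacement never shows up.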


Thanks again for a great discussion,
Cheers!
ken
2011-06-09 15:06:50 UTC
Permalink
In article <fd01f57e-e9b3-4839-9ba8-5c732bcfc506
@m4g2000yqk.googlegroups.com>, ***@imi.aau.dk says...
Post by sdaau
Post by ken
It only occurs (in the code I wrote at least) when R=G=B, so it's a shade
of gray. In that case it doesn't matter what component you choose, they
are all the same :-)
Yes, but I meant more from a perceptual perspective: for instance, I
am pretty certain that black as (CMY)K:(0,0,0),1 on paper should
correspond to RGB:0,0,0 on screen; and that white as (CMY)K:(0,0,0),0
should correspond to RGB:1,1,1. However, would RGB:0.8,0.8,0.8 map
linearly to K:0.2 - or are there some 'transformations' involved, when
mapping perception of grayscale from screen to paper (e.g. instead of
K=1-R, may be something like K=1-0.2*(5^R) would be more
appropriate) ?
If you're worried about colour fidelity, you shouldn't be using an
application which produces RGB to produce documents for print ;-)

This has long been a criticism of Microsoft Office and Publisher, people
who care about colour want (at the very least!) to be able to specify
CMYK colours, not RGB. The whole colour model is different (reflective
vs transmissive).

Seriously, I wouldn't worry about it too much, I expect it'll be close
enough for you.
Post by sdaau
Post by ken
Post by sdaau
Right - and a note to myself: after pdf to ps (and thus in the final
round trip from ps to pdf) text information is gone - all the font
glyphs apparently become treated as curves (since I cannot select or
copy the text in `evince` anymore); ...
pswrite is *really* basic, ps2write does a much better job. All text is
converted to outlines by pswrite, which is one reason the output tends
to be huge. There are many other compromises too.
Just to make sure - I was using the ps2write (not pswrite) in the
example above, and that also seems to 'flatten' the text (although, as
I noted, I don't mind that, and the other compromises - as long as the
print comes out nice :) )..
ps2write really should never convert text to outlines, worst case it
might produce bitmaps instead of scalable fonts. I don't think there's
any way it can convert to outlines (I had a recent request for that, so
I'm reasonably sure ;-)
Post by sdaau
Thanks for noting that; good to have the notion, that level 2 should
still be generally safe to use.
I guess 'dialects' was the wrong word to use; when I wrote that, I was
Post by ken
Post by sdaau
... and since the debug text "in replacement setrgbcolor" never
appears on stdout, this means the procedure is not even triggered!
Presumably because there are no colours specified using setrgbcolor....
... if not a different dialect, then sure there seems to be different
ways of specifying color: for instance, depending on how a conversion
from PDF to PS is performed, the PS file may or may not specify colors
using setrgbcolor.
Well PostScript is a programming language; there are usually multiple
ways of achieving the same end in programming languages. Some may be
preferable.

One reason to use "/DeviceRGB setcolorspace R G B setcolor" instead of
"R G B setrgbcolor" would be if you were going to specify lots of
colours. Saving the 3 bytes per time of setcolor vs setrgbcolor can
mount up if you do lots of them, leading to smaller files. These days
nobody really cares much about that ;-)
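
The size argument can be checked by counting characters. The sketch below compares one colour change written both ways and works out after how many colours the one-time `/DeviceRGB setcolorspace` pays for itself; the 0.5 values are arbitrary sample operands, and separators and line endings are ignored.

```shell
long='0.5 0.5 0.5 setrgbcolor'     # colour set with setrgbcolor
short='0.5 0.5 0.5 setcolor'       # same colour, space already selected
setup='/DeviceRGB setcolorspace'   # one-time cost of selecting the space

# ${#var} is the length of the string in characters.
saving=$(( ${#long} - ${#short} ))
breakeven=$(( (${#setup} + saving - 1) / saving ))   # ceiling division

echo "saving per colour:   $saving bytes"
echo "one-time setup cost: ${#setup} bytes"
echo "break-even after:    $breakeven colours"
```

Counted this way, each colour change is 3 bytes shorter with setcolor, and the 24-byte setup is recovered after 8 colour changes.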

Also there are a number of operators which use the current color space,
so if you are going to be working in RGB it's often more efficient to set
the colour space to RGB, and then just go. Same for other spaces of
course.

Microsoft Office used to (probably still does) create patterns by
drawing lots of teeny tiny images in an Indexed (ie palette) colour
space. It was hideously inefficient because it would set the current
colour space to RGB then save the graphics state, set the colour space
to the paletted space and draw the image, then restore back to
DeviceRGB, rinse and repeat.

The RIP I was working on at the time did a certain amount of work
whenever the colour space changed. By switching inanely back and forth
like that the files took a long time to process. We eventually added a
cache to cater for the situation.

Setting the colour space to the paletted colour, drawing all the images
and then restoring back would have been *much* more efficient....

But concerns like those went away when people stopped sending PostScript
files to RIPs using a 9,600 bits/sec serial interface :-)
Post by sdaau
And I guess, that is what would limit the usability
of a script like HackRGB.ps?
You could still do it. You would need to monitor calls to setcolor
instead of setrgbcolor, check the current colour space and if it's
/DeviceRGB check the three components. If they are the same then you
could set the colour space to Gray, and call setcolor with one
component. You would need to remember what the last colour space was, so
that on the next call to setcolor you could restore the original space
first. Obviously you would also monitor setcolorspace calls in case the
space changed after the last setcolor.

Clearly the complexity of the challenge goes up, but it's still possible.
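
A rough, untested sketch of that idea, written out as a shell here-document so it can be saved and tried. Everything in it is an assumption rather than the HackRGB.ps from this thread: the file name HackSetcolor.ps, the hack_ names, and the choice to emit CMYK (0 0 0 K) directly instead of gray. Known gaps: the flag does not follow gsave/grestore, colour changes made through setgray, setrgbcolor or setcmykcolor in between are not tracked, packed arrays are not recognised, and procedures that were already bound with `bind` will not see the replacements.

```shell
cat > HackSetcolor.ps <<'EOF'
%!PS
% Keep references to the original operators.
/sys_setcolorspace /setcolorspace load def
/sys_setcolor      /setcolor      load def

% Remembers whether the document last selected /DeviceRGB.
/hack_inRGB false def

% any -> bool : true for /DeviceRGB or [/DeviceRGB]
/hack_isRGB {
  dup type /arraytype eq { dup length 0 gt { 0 get } if } if
  /DeviceRGB eq
} def

/setcolorspace {
  dup hack_isRGB /hack_inRGB exch store   % record what the document asked for
  sys_setcolorspace
} def

/setcolor {
  hack_inRGB {
    2 index 2 index eq                    % R G B (R==G)
    2 index 2 index eq and                % R G B (R==G and G==B)
    {
      pop pop 1 exch sub                  % K = 1 - R
      0 0 0 4 -1 roll setcmykcolor        % 0 0 0 K
    } {
      % A previous gray may have left DeviceCMYK current: restore RGB first.
      /DeviceRGB sys_setcolorspace sys_setcolor
    } ifelse
  } {
    sys_setcolor                          % other colour spaces pass through
  } ifelse
} def
EOF
```

It would be loaded the same way as HackRGB.ps, that is, named ahead of the document on the `gs` command line, and the result should be checked with `tiffsep` like the other attempts.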


Ken
tlvp
2011-09-16 22:33:00 UTC
Permalink
Post by sdaau
Hi all,
Many, many thanks for the assistance with this problem! I believe it
is more or less solved now - somewhat of a mammoth post follows, but
* forceblack.ps uses `pdftops` PS file, manipulates /DeviceRGB, /setcolorspace
* HackRGB.ps uses `gs` ps2write PS file, manipulates /setrgbcolor, /setgray
Post by Helge Blischke
If you convert your OOo generated PDFs to PostScript using
pdftops (from the xpdf suite) and then prepend the attached
forceblack.ps to the resulting PostScript file, RGB colors where
R==G==B will be printed as pure black (replaced with the
appropriate gray value).
Note that this trick won't work with PDF input or PostScript
produced using Ghostscript's ps2write device.
Thanks for that, Helge Blischke; here is a command line log of what I
pdftops blah-slide.pdf blah-slide-tops.ps
cat forceblack.ps blah-slide-tops.ps > blah-slide-forceblack.ps
# blah-slide-forceblack.ps has wrong page size in evince!
sed -n '/\/pdfE/{p}' blah-slide.ps
## sed without -n will output entire input file
## the 'r' command reads in forceblack.ps, and
## adds/inserts it after the matching line
## http://www.grymoire.com/Unix/Sed.html#uh-0
sed '/\/pdfE/r forceblack.ps' blah-slide-tops.ps > blah-slide-forceblack.ps
# now blah-slide-forceblack.ps is the correct page size!
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-forceblack.ps && eog p01.tif 2>/dev/null
... and again, `gs` with `tiffsep` shows black text on all four CMYK
plates.
I guess, this is what they call "preflight", as in checking whether
the separations are coming out right - and again, I'm not sure how
reliable `gs` with `tiffsep` is; but I don't know of any other tool in
Linux that could open a PDF/PS and show expected CMYK separations; so
if anyone has any alternatives to `gs` with `tiffsep` on Linux, please
write back. Then again, there is the problem that the printer guy may
not necessarily obtain the same CMYK separations as I do (regardless
of the software I use to render these) - but at least, for now `gs`
with `tiffsep` offers at least a starting point...
Possibly, the problem may end up boiling down to `gs` with `tiffsep`,
as a "preflight" software - *and* my printer's actual setup - may
choose to send (gray) RGB values (or even values declared as
Grayscale) to all four CMYK plates; while that may not be the case
with other print setups or shops (for the same PDF or PS file). Which
is why in that case, the best for me would be to explicitly try to
convert gray R=G=B values into CMY:0+K values, instead of into
Grayscale?
(anecdote that may confirm this: these days I had a more-less textual
content document from `pdflatex`, split into ranges with `pdftk`,
printed on an office laser printer [don't know the brand] - apparently
something in the laser printer was misaligned, and I could see blueish
[though not 'actual' cyan] outline leaking less than 1 mm 'northwest'
of each and every letter; don't know if that would be a proof of RGB
text black being interpreted in that chain [I guess they used Windows
or Mac to print] as 'rich' black, i.e. C:1 M:1 Y:1 K:1).
Post by Helge Blischke
I wasn't able to do this conversion in a single pass using the
Ghostscript PDF interpreter, because it uses the setrgbcolor
directly from systemdict, so it doesn't allow for replacement.
Instead I first converted your files to PostScript using the
gs -sDEVICE=ps2write -sOutputFile=./out.ps ./blah-slide.pdf
Thanks for that, Ken - good to know some constructs don't allow
replacements..
Post by Helge Blischke
Then I created a simple replacement routine, and stored it in a
... [snip] ...
This replaces the setrgbcolor operator with a routine which
tests the RGB value and if all components are equal it replaces
it with a call to setgray using just one of the components. (BTW
you can remove the line ending in 'print', it's just there so
that you can see something is happening ;-)
gs -sDEVICE=pdfwrite -sOutputFile=./out.pdf ./HackRGB.ps ./out.ps
Many thanks for the commented code (and the tip for `print` for
debugging postscript - and the example of how to use an 'external'
# I had to add -dNOPAUSE -dBATCH to avoid having
# '>>showpage, press <return> to continue<<' and
# the prompt 'GS>' shown...
gs -dNOPAUSE -dBATCH -sDEVICE=ps2write \
  -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB.pdf ./HackRGB.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB.pdf && eog p01.tif 2>/dev/null
Sadly, similarly to the use of the previous forceblack.ps, I again get
all four separations here showing letters...
Post by Helge Blischke
This results in a PDF file where the text is in a shade of gray.
This *ought* to be acceptable to your print shop, because gray
should map straight to the K channel of CMYK.
Yes - but, as I commented previously: if the process that they have at
the printer's shop behaves the same as `gs` with `tiffsep`, then
they'll still see what I see - that is, the black for text letters
showing on all four CMYK plates.
Post by Helge Blischke
If for some reason that isn't acceptable, you could replace the
'setgray' with '0 0 0 4 -1 roll setcmykcolor' which uses CMYK
directly.
Ahh - thanks for that; now that looks very promising to me :) !
I did the replacement, called that version HackRGB-cmyk.ps, and tried
gs -dNOPAUSE -dBATCH -sDEVICE=ps2write \
  -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null
... and - partial success here: CMY plates are *finally* blank white -
but the K plate is inverted (what should be white background, is shown
in black; and the letters are grayer than that) :) Interestingly, the
same effect is shown if I open blah-slide-hackRGB-cmyk.pdf in
`evince`, too. Also interestingly, if I use the `pdftops` output
(blah-slide-tops.ps), then the final pdf is not inverted - but the
separations show again black text on all four CMYK plates:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-tops.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null
... and since the debug text "in replacement setrgbcolor" never
appears on stdout, this means the procedure is not even triggered!
Then, since the whole problem seems to be a simple matter of
calculating K=1-R (instead of using K=R), I tried to modify the script
Post by Helge Blischke
PostScript, The Forgotten Art of Programming | Linux Journal - http://www.linuxjournal.com/article/2386
http://homepage.mac.com/andykopra/pdm/tutorials/an_introduction_to_postscript.html
... it seems to have worked :) First, I tried use ghostscript in
command line mode, reconstructing a simple stack and pasting
modifications of the HackRGB-cmyk.ps so I could see what I was writing
http://sdaaubckp.sourceforge.net/post/ps/debug-paste-cmds.ps
http://sdaaubckp.sourceforge.net/post/ps/debug-paste-cmds.ps.log
http://sdaaubckp.sourceforge.net/post/ps/HackRGB-cmyk-inv.ps
- 0 0 0 4 -1 roll setcmykcolor
+ 0 0 0 4 -1 roll -1 mul 1 add setcmykcolor
... along with an added piece of code that will do the same for
setgray values.
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
  -sOutputFile=./blah-slide-hackRGB-cmyk-inv.pdf ./HackRGB-cmyk-inv.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
  -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk-inv.pdf && eog p01.tif 2>/dev/null
... and now, I do get black text on white background only in the K
plate, while CMY plates are blank white :) Although, I'm not sure how
accurate of a formula K=1-R is; if someone can suggest a more accurate
formula, please write back!
Post by Helge Blischke
NOTE! This only affects linework (text, vectors), only affects
linework using RGB and will only convert that to gray if the R,
G and B values are identical. Images, shadings and potentially
other object types will not be affected.
Thanks for that - for color images, I anyway have to pay all four
inks, so for them, maybe 'rich' black is preferable; for shadings -
yeah, will have to look into that once I get a problem with it :)
Post by Helge Blischke
I should also mention that going from PDF to PostScript and back
to PDF is a potentially lossy process which can introduce errors
and odd artefacts, you should check files carefully after this
conversion. I haven't tested this code particularly.
Right - and a note to myself: after pdf to ps (and thus in the final
round trip from ps to pdf) text information is gone - all the font
glyphs apparently become treated as curves (since I cannot select or
copy the text in `evince` anymore); I guess hyperlinks would be gone
too - but it doesn't matter really; as this is a document specifically
intended for a print shop :)
Post by Helge Blischke
If you print directly to PostScript then you can eliminate one
conversion step, which is probably worthwhile.
I haven't really tried this, but as far as I can see, there are
several PostScript dialects - and I'm not sure, if I, say, export from
OpenOffice to PS directly, whether it is guaranteed that it will
feature /setrgbcolor type syntax (instead of the /DeviceRGB, /
setcolorspace type syntax). Which is why it's good to know that `gs` +
`ps2write` would help in such a case regardless :)
Post by Helge Blischke
You only get CMYK output if you have an interpreter which
applies the ICC (via a Colour Management System) profile to
create CMYK. In general you won't get this.
What usually happens is that you get a PDF which contains
colours in an ICCBased colour space. Which your print shop
probably won't like either.
Or possibly the colours still specified in RGB, but an
OutputProfile attached, which simply describes the RGB space for
which these were intended. A fully ICC ompliant workflow (ie
including your printer) would be able to create a link from the
ICC profile in the PDF to the ICC profile used for the printing
device, and everything would magically work out. This is rare,
it usually only works on closed workflows (that is, not
accepting submissions from the outside world)
Thanks for this - I can see that I still fail to understand properly
how ICC profiles really work; however, I hope with the solution above
I won't need to :) Especially thanks for the 'closed workflow' comment
- I was suspecting that may be the case, but I'm not that involved
with the industry to have actual experience of the kind...
Post by Helge Blischke
PostScript is a Turing-complete programming language. While
there are things that are hard to do, very little is impossible.
Heh - as soon as I saw this, I started reading up on it a bit - and
'oh dear' - there is a LOT of history involved in this, and reverse
Polish Notation doesn't make it any easier :) But I was glad I could
Post by Helge Blischke
/sys_setcolorspace /setcolorspace load def
/setcolorspace {
...
/oldsetrgbcolor /setrgbcolor load def
/setrgbcolor {
\let\Oldincludegraphics\includegraphics
\renewcommand{\includegraphics}[1]{\Oldincludegraphics[width=\maxwidth]{#1}}
... in Latex, no? :)
Anyways - thanks again everyone for the help with this problem,
Cheers!
At the risk of wasting good keystrokes on a necro-thread, let me
nonetheless remark on my understanding of the encoding of black
in postscript (corrections welcomed if called for):

In the RGB colorspace, black is specified as 0 0 0 .
In the grayscale "colorspace", black is specified as 0 .
In the CMYK colorspace, black is specified either as 1 1 1 1 (rich)
or as 0 0 0 1 (pure).

In any event, it is clear that your grayscale blackness parameter and
your matching K value are complementary real numbers (whose sum is 1).

This may help explain why your correction (R to 1-R) was necessary.
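
Where the "rich" and "pure" variants come from can be illustrated with the DeviceRGB to DeviceCMYK conversion as I recall it from the PostScript Language Reference: C, M and Y start as the complements of R, G and B, k is their minimum, and two device-dependent functions (undercolor removal and black generation) decide how much of k moves from CMY into K. The Python sketch below assumes that description; the `ucr` and `bg` choices shown are illustrative extremes, not the defaults of any particular Ghostscript device.

```python
def rgb_to_cmyk(r, g, b, ucr=lambda k: 0.0, bg=lambda k: 0.0):
    """Convert RGB (0..1) to CMYK with pluggable UCR and BG functions.

    ucr(k): how much to subtract from each of C, M, Y (undercolor removal).
    bg(k):  how much black to generate (black generation).
    """
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b
    k = min(c, m, y)

    def clamp(v):
        return min(1.0, max(0.0, v))

    return (clamp(c - ucr(k)), clamp(m - ucr(k)), clamp(y - ucr(k)), clamp(bg(k)))


def identity(k):
    return k


if __name__ == "__main__":
    print(rgb_to_cmyk(0, 0, 0))                            # no UCR, no BG
    print(rgb_to_cmyk(0, 0, 0, bg=identity))               # rich black
    print(rgb_to_cmyk(0, 0, 0, ucr=identity, bg=identity)) # pure K black
```

With no removal and no generation, RGB black lands entirely on the C, M and Y plates; with full removal and full generation it lands only on K, which is what the print shop wants for text.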

Hoping this doesn't come too too late, I offer you cheers, -- tlvp
--
Before replying, throw out the trash can, please.
sdaau
2011-06-06 16:17:53 UTC
Permalink
Yes - well, thanks to pointers I got here, I tried several ways of
converting the same pdf into a .ps file, and they all give different
sort of files (different dialect of PostScript, I suppose ?!) :)

I was interested in getting a human readable output, and then trying
to change it by 'hand', and here's what I tried:


gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pswrite -sOutputFile=01-gs.ps blah-slide.pdf

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=ps2write -dASCII85EncodePages=false -sOutputFile=02-gs.ps blah-slide.pdf

pdf2ps -dASCII85EncodePages=false -sProcessColorModel=DeviceCMYK blah-slide.pdf 03-2ps.ps

pdftops blah-slide.pdf 04-tops.ps


From the above, 02-gs.ps and 03-2ps.ps will turn out with compressed
data inside, hence not human editable. 01-gs.ps, generated by device
`pswrite` is 'human readable'; however has code like this:

$ grep -A 1 -i 'rgb\|device' 01-gs.ps
{ pop/setpagedevice where
{ pop 1 dict dup /PageSize PageSize put setpagedevice}
{ /setpage where{ pop PageSize aload pop pageparams 3 {exch pop} repeat
--
/rG{3{3 -1 roll 255 div}repeat setrgbcolor}!/G{255 div setgray}!/K{0 G}!
/r6{dup 3 -1 roll rG}!/r5{dup 3 1 roll rG}!/r3{dup rG}!

... and I don't see anything resembling RGB coordinates here :)

Turns out, the 04-tops.ps (generated by pdftops - which, as I learned,
comes from xpdf) has the right output:

$ grep -A 1 -i 'cmyk\|rgb\|device' 04-tops.ps
/setpagedevice where {
pop 3 dict begin
--
currentdict end setpagedevice
} {
--
/DeviceGray {} cs
[0] sc
/DeviceGray {} CS
[0] SC
--
/DeviceRGB {} cs
[1 1 1] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc


So what I did was open this 04-tops.ps manually in nano, change
all instances of DeviceRGB and DeviceGray to DeviceCMYK, and remap
the values accordingly, as in:

$ grep -A 1 -i 'cmyk\|rgb\|device' 04-tops.ps
/setpagedevice where {
pop 3 dict begin
--
currentdict end setpagedevice
} {
--
/DeviceCMYK {} cs
[0 0 0 1] sc
/DeviceCMYK {} CS
[0 0 0 1] SC
--
/DeviceCMYK {} cs
[0 0 0 0] sc
--
/DeviceCMYK {} cs
[0 0 0 0.2353] sc
--
/DeviceCMYK {} cs
[0 0 0 0.2353] sc
--
/DeviceCMYK {} cs
[0 0 0 1] sc
--
/DeviceCMYK {} cs
[0 0 0 0.2353] sc

And now one could see that, say, evince took much longer to open the
file. It seems that the final two RGB colors [0.2353 0.2353 0.2353]
provide the black color for the two pieces of text in the pdf (I
tested this by setting the next-to-last one to [0 0 0 1] CMYK); just
copying this value (0.2353) to K is not correct (it gives a very weak
gray).

However, the best part is that now, FINALLY, `tiffsep` shows colors
only in the K channel:

gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%08d.tif 04-tops.ps && eog p00000001.tif

and now even if I convert this ps to pdf:

ps2pdf 04-tops.ps

I again get correct tiffseps that show black only in K:

gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%08d.tif 04-tops.pdf && eog p00000001.tif

Funny thing - `identify` will recognize *both* 04-tops.ps and
04-tops.pdf as RGB :)

Well, I guess I could cook myself up a python script iterating through
all of these colors, and calculate and replace CMYK values, however:

* Is there a guarantee that pdftops will always generate this kind of
syntax with /DeviceRGB, regardless of what PDF I throw at it?
* What happens if /DeviceRGB is not attributed to something like color
coordinates (say, if it is attributed to an image) - if that's
possible at all?
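
A rough Python sketch of such a script is below. It assumes exactly the two-line pattern visible in the grep output above (`/DeviceRGB {} cs` or `/DeviceGray {} cs`, followed by `[...] sc`, in either letter case), and nothing in this thread confirms that pdftops always emits that form. It only touches neutral values, mapping them to K = 1 - value, and leaves every other line alone, so images and anything else that uses /DeviceRGB differently pass through unchanged. The names `to_k_only` and `fmt` are made up for illustration.

```python
import re

# Patterns for the two-line color idiom seen in the pdftops output.
SPACE_RE = re.compile(r"^/(DeviceRGB|DeviceGray) \{\} (cs|CS)$")
COLOR_RE = re.compile(r"^\[([0-9. ]+)\] (sc|SC)$")


def fmt(value):
    """Format a component with at most 4 decimals and no trailing zeros."""
    return f"{value:.4f}".rstrip("0").rstrip(".")


def to_k_only(ps_text):
    """Rewrite neutral DeviceGray/DeviceRGB colors as K-only DeviceCMYK."""
    lines = ps_text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        space = SPACE_RE.match(lines[i])
        color = COLOR_RE.match(lines[i + 1]) if space and i + 1 < len(lines) else None
        if space and color:
            comps = [float(x) for x in color.group(1).split()]
            # Neutral means one gray component or three identical RGB ones.
            if len(comps) in (1, 3) and len(set(comps)) == 1:
                k = 1.0 - comps[0]
                out.append(f"/DeviceCMYK {{}} {space.group(2)}")
                out.append(f"[0 0 0 {fmt(k)}] {color.group(2)}")
                i += 2
                continue
        out.append(lines[i])
        i += 1
    return "\n".join(out)


if __name__ == "__main__":
    sample = "/DeviceRGB {} cs\n[0.2353 0.2353 0.2353] sc"
    print(to_k_only(sample))
```

For the dark gray above this yields K = 0.7647 rather than 0.2353, which is the complement that the manual edit missed.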

In the end, I'm guessing that with a proper command line, ghostscript
should be able to do this - however, the only time I got some success
(a report of CMYK by `identify`) there had to be an image present; plus
it seemingly doesn't handle the 000->0001 mapping (which maybe color
profiles would address?)

Anyways, I'd love to hear some comments on this,
Thanks,
Cheers!



PS: One more (maybe) relevant note:

HOWTO Convert a ps file to CMYK - http://www.met.rdg.ac.uk/~dan/work/H2ConvertToCMYK.html
As far as I know gs (ghostscript) doesn't support CMYK postscript.
So, doing:

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pswrite -sProcessColorModel=DeviceCMYK -sOutputFile=01b-gs.ps blah-slide.pdf

... results in: "Unrecoverable error: rangecheck in .putdeviceprops".
However, the same error also appears for -sProcessColorModel=DeviceGray;
yet it doesn't appear for -sColorConversionStrategy=CMYK (replacing
-sProcessColorModel=DeviceCMYK), but there seems to be no significant
difference...
Helge Blischke
2011-06-06 17:30:45 UTC
Permalink
sdaau wrote:

[...]

If you convert your OOo generated PDFs to PostScript using pdftops (from the
xpdf suite) and then prepend the attached forceblack.ps to the resulting
PostScript file, RGB colors where R==G==B will be printed as pure black
(replaced with the appropriate gray value).

Note that this trick won't work with PDF input or PostScript produced using
Ghostscript's ps2write device.

Helge
o***@gmail.com
2012-01-25 14:40:19 UTC
Permalink
Hi Helge

Your "recipe" works very nicely - thanks a lot for that - except for one refinement that would probably be nice:

When black text is printed on a background color other than white, in my case a light cyan, that background color is carved out to white; and as far as I remember from my prepress time, it is difficult for the printer to avoid white margins around the glyphs if the background under the glyphs is white rather than the same color as around them.

Can this be taken into consideration e.g. in forceblack.ps? (I unfortunately don't feel competent in Postscript to quickly figure that out myself.)

Thanks a lot for your time!
Ole
Helge Blischke
2012-01-25 15:09:13 UTC
Permalink
Post by o***@gmail.com
Hi Helge
Your "recipe" works very nicely - thanks a lot for that - except for one
When black text is printed on a background color other than white, in my
case a light cyan, that background color is carved out to white; and as
far as I remember from my prepress-time, it is difficult for the printer
to avoid white margins around the glyphs if the background color is white
and not the same under the glyphs as around them also.
Can this be taken into consideration e.g. in forceblack.ps? (I
unfortunately don't feel competent in Postscript to quickly figure that
out myself.)
Thanks a lot for your time!
Ole
The knockout of your background color should only occur when your printer
(what make and model?) is generating color separations.
In any case, you could try to insert the statement
true setoverprint
just before each of the "}bind def" lines in the forceblack.ps script.
But there is no guarantee that this works.
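
That edit can also be done mechanically. Below is a small Python sketch, assuming each "}bind def" sits on a line of its own (optionally written "} bind def"). The contents of forceblack.ps are not shown in this thread, so that layout is a guess, and `add_overprint` is a made-up name.

```python
import re

# Matches a line consisting only of "}bind def" or "} bind def".
BIND_DEF_RE = re.compile(r"^\s*\}\s*bind\s+def\s*$")


def add_overprint(ps_text):
    """Insert 'true setoverprint' before every line that closes a procedure
    with '}bind def', so the statement becomes the last thing the procedure
    executes."""
    out = []
    for line in ps_text.split("\n"):
        if BIND_DEF_RE.match(line):
            out.append("true setoverprint")
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "/setrgbcolor {\n  pop pop setgray\n}bind def"
    print(add_overprint(sample))
```

Lines where "}bind def" follows other code on the same line are left untouched and would still need editing by hand.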

Helge
o***@gmail.com
2012-01-26 14:41:50 UTC
Permalink
Thanks a lot for quick reply!

Tried the "true setoverprint" as you mentioned, but it didn't help. I'm using the tiffsep ghostscript device for checking like sdaau did.

I'm wondering whether it would make a difference to first convert the rgb-PDF to cmyk and then apply your procedure? It appears Postscript defines overprinting only for subtractive color models like cmyk, not for additive ones like rgb. And forceblack.ps is assuming an rgb color model, isn't it? Again, I don't feel in a position to quickly modify forceblack.ps for the cmyk colorspace. Would that be a big thing for you?-)

Thanks again!
Ole
Helge Blischke
2012-01-26 20:07:59 UTC
Permalink
Post by o***@gmail.com
Thanks a lot for quick reply!
Tried the "true setoverprint" as you mentioned, but didn't help. I'm using
the tiffsep ghostscript device for checking like sdaau did.
I'm wondering if it made a difference to first convert the rgb-PDF to cmyk
and then apply your procedure? It appears Postscript defines overprinting
only for subtractive color models like cmyk, not for additive ones like
rgb. And forceblack.ps is assuming an rgb color model, isn't it? Again,
feel not in a position to quickly modify forceblack.ps to cmyk colorspace.
Would that be a big thing for you?-)
Thanks again!
Ole
Well, the issue of the original poster indeed was RGB only. I just modified
the script to handle both RGB and CMYK color spaces. Try it out and let me
know if it works as expected.

Helge
o***@gmail.com
2012-01-26 20:57:12 UTC
Permalink
Hi Helge

Found the solution in the answer sdaau posted on stackoverflow (see http://stackoverflow.com/questions/6248563/converting-any-pdf-to-black-k-only-cmyk/9024346#9024346).

After converting the original rgb pdf to ps with pdftops, convert the ps to a cmyk pdf with ghostscript's pdfwrite device, prepending first sdaau's HackRGB-cmyk-inv.ps (http://sdaaubckp.sourceforge.net/post/ps/HackRGB-cmyk-inv.ps) and then your forceblack.ps with the "true setoverprint" extension. This will overprint black text on the existing background color:

pdftops -level2 test.pdf test.ps
gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -sOutputFile=test-cmyk-k-overprint.pdf HackRGB-cmyk-inv.ps forceblack.ps test.ps
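
For more than one file, the two commands above can be generated from the input name. The Python sketch below only builds the argument lists and prints them; the output naming scheme simply mirrors the example above, and `build_commands` is a made-up helper, not part of any tool.

```python
import shlex
from pathlib import Path


def build_commands(pdf, prologues=("HackRGB-cmyk-inv.ps", "forceblack.ps")):
    """Return the pdftops and gs argument lists for one input PDF.

    The prologue files are passed to gs before the converted PostScript,
    in the same order as in the command line above.
    """
    pdf = Path(pdf)
    ps = pdf.with_suffix(".ps")
    out = pdf.with_name(pdf.stem + "-cmyk-k-overprint.pdf")
    to_ps = ["pdftops", "-level2", str(pdf), str(ps)]
    to_pdf = ["gs", "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH", "-dSAFER",
              f"-sOutputFile={out}", *prologues, str(ps)]
    return to_ps, to_pdf


if __name__ == "__main__":
    for cmd in build_commands("test.pdf"):
        print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the printed commands look right.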

Thanks again,
Ole
r***@gmail.com
2013-02-27 15:10:27 UTC
Permalink
Hello all!

This last script works very well for my project!
Thanks!

But I have a problem.
How can I convert RGB magenta to 100% magenta?
Post by o***@gmail.com
Hi Helge
Found the solution in the answer sdaau posted on stackoverflow (see http://stackoverflow.com/questions/6248563/converting-any-pdf-to-black-k-only-cmyk/).
pdftops -level2 test.pdf test.ps
gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -sOutputFile=test-cmyk-k-overprint.pdf HackRGB-cmyk-inv.ps forceblack.ps test.ps
Thanks again,
Ole
r***@gmail.com
2013-02-27 15:36:54 UTC
Permalink
Post by sdaau
Hi all,
I basically have the problem with print of some slides from
OpenOffice. The problem is that OpenOffice exports the PDF of the
slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
usually when I send that to the printer, they complain that what
should be plain black extends into all four (CMYK) channels, and so I
have to pay more for the ink.
So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
100)? I posted a similar question on
http://stackoverflow.com/questions/6241282/converting-pdf-to-cmyk-with-identify-recognizing-cmyk
... although that question is more Latex oriented. So here, I'll try
* Open OpenOffice Impress, use Empty Presentation, click Create
* Add some text for 'title' and 'text'
* Click File/Export as PDF; call this PDF blah-slide.pdf
At this point, close and reopen OpenOffice, for yet another slide
* Open OpenOffice Impress, use Empty Presentation, click Create
* Add some text for 'title' and 'text'
* Click Insert/Picture/From File... and insert whatever PNG image
** I used `convert -size 10x10 xc:red img.png` to generate a PNG image
to insert
* Click File/Export as PDF; call this PDF blah-slideP.pdf
At this point, we can run ImageMagick's `identify` on both pdf's, and
$ identify -verbose blah-slide.pdf | grep -i 'type\|color'
Type: Grayscale
Base type: Grayscale
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black
$ identify -verbose blah-slideP.pdf | grep -i 'type\|color'
Type: TrueColor
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black
Now, I'm aware that `identify` in principle works on raster images,
but I cannot find any other application that will provide similar
color information for PDFs (any other suggestions?)
Furthermore, the only check I have for CMYK separations for now (any
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%08d.tif blah-slide.pdf && eog p00000001.tif
(or)
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP.pdf && eog p00000001.tif
Of course, both of these show that the black color of the text is
'rich' black - on all four CMYK plates - instead of a plain 'black',
just in the K channel...
//////
So, now I finally try the command line I found in
http://www.productionmonkeys.net/guides/ghostscript/examples for
converting, as it says, "Color PDF to CMYK" - for both of these PDFs
$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK -sOutputFile=blah-slide-out.pdf blah-slide.pdf
$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK -sOutputFile=blah-slideP-out.pdf blah-slideP.pdf
.. And here is now the interesting thing - if I try to run `identify`
again - *only* the pdf containing an image is the one recognized as
$ identify -verbose blah-slide-out.pdf | grep -i 'type\|color'
Type: Palette
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black
$ identify -verbose blah-slideP-out.pdf | grep -i 'type\|color'
Type: ColorSeparation
Base type: ColorSeparation
Colorspace: CMYK
Background color: white
Border color: cmyk(223,223,223,0)
Matte color: grey74
Transparent color: black
However, regardless of how they are reported, if I try to view their
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-out.pdf && eog p00000001.tif
$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP-out.pdf && eog p00000001.tif
... I can still see that both of these PDFs still feature the text in
rich black, in all four color separations.
* How can I convert a rich black text color in an RGB pdf - into a
plain black text color in a CMYK pdf?
* Why do I need an image in the slide, so that `identify` recognizes
the "converted" CMYK pdf as being really CMYK?
my question on stackoverflow:
http://stackoverflow.com/questions/15115990/convert-rgb-pdf-to-cmyk-keep-100-k-black-and-100-mmagenta-on-linux

regards
Post by sdaau
(* Are there any other alternative free tools for: conversion of RGB
to CMYK pdf; and: checking the print separations of any PDF?)
As a final note: I guess this kind of thing may have something to do
(and be achievable) with ICC profiles, which unfortunately I don't
understand very much - and I've had a lot of problems finding example
command lines; so if there is such a solution, an example command line
will be much appreciated.
Thanks in advance for any responses,
Cheers!
r***@gmail.com
2013-02-28 18:06:20 UTC
Permalink
Thanks to the guys on stackoverflow, here is the answer:

Update HackRGB-cmyk to this:

%!
/oldsetrgbcolor /setrgbcolor load def
/setrgbcolor {
(in replacement setrgbcolor\n) print
%% R G B
1 index 1 index %% R G B G B
eq { %%
2 index 1 index %% R G B R B
eq {
%% Here if R = G = B
pop pop %% remove two values
% setgray % "replace the 'setgray' with":
0 0 0 4 -1 roll % setcmykcolor
-1 mul %% obtain -R on top of stack
1 add %% obtain 1-R on top of stack
setcmykcolor %% now set(cmykcolor) K (as 1-R)
} {
oldsetrgbcolor %% set the RGB values
} ifelse
}{
oldsetrgbcolor %% Set the RGB values
currentcmykcolor %puts 4 numbers on the stack
(cmyk-) print pstack %display the colors (remove when things work correctly)
3 -1 roll %put magenta on top of stack
dup %make copy of magenta value
.5 %put magenta test value on stack (it may not be exactly .5, see pstack)
eq %see if magenta is equal to test value (.5)
{pop 1}if %if it is equal, pop off the old magenta value and put a 1 onto the stack
3 1 roll %put magenta back where it belongs in the stack
setcmykcolor %reset the cmyk to have new magenta value
}ifelse

} bind def
/oldsetgray /setgray load def
/setgray {
(in replacement setgray\n) print
% == % debug: pop last element and print it
% here we're at a gray value;
% http://www.tailrecursive.org/postscript/operators.html#setcymkcolor
% setgray: "gray-value must be a number from 0 (black) to 1 (white)."
% setcmykcolor: "The components must be between 0 (none) to 1 (full)."
% so convert here again:
0 0 0 4 -1 roll % push CMY:000 after Gray and roll down,
% so top of stack becomes
% ...:C:M:Y:Gray
-1 mul %% obtain -Gray on top of stack
1 add %% obtain 1-Gray on top of stack
setcmykcolor %% now set(cmykcolor) K (as 1-Gray)
} bind def


%~ # test: rgb2gray
%~ gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=./blah-slide-hackRGB-gray.ps ./HackRGB.ps ./blah-slide-gsps2w.ps
%~ # gray2cmyk
%~ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-gray-ci.pdf ./HackRGB-cmyk-inv.ps ./blah-slide-hackRGB-gray.ps
%~ # check separations - looks OK
%~ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-gray-ci.pdf && eog p01.tif 2>/dev/null
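
One caveat on the `.5 eq` test in the script above: `eq` compares the two numbers exactly, and the script's own comment already warns that the converted magenta may not be exactly .5. A tolerance-based comparison is one way around that. The Python sketch below only illustrates the logic; the target of 0.5 comes from the script, while the tolerance of 0.01 and the name `snap_magenta` are assumptions that would need checking against the pstack output.

```python
def snap_magenta(cmyk, target=0.5, tol=0.01):
    """Return cmyk with magenta forced to 1.0 when it lies within tol of
    target; all other components are passed through unchanged."""
    c, m, y, k = cmyk
    if abs(m - target) <= tol:
        m = 1.0
    return (c, m, y, k)


if __name__ == "__main__":
    print(snap_magenta((0.0, 0.502, 0.0, 0.0)))  # close enough to .5
    print(snap_magenta((0.0, 0.3, 0.0, 0.0)))    # left alone
```

In PostScript the same idea would replace the `eq` with a subtraction, `abs` and an `le` against the tolerance.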