How to get the nesting level of XObjects in a PDF file using iTextSharp.
[Fabrizio Accatino - [email protected]]
Using iTextSharp (or iText) it is possible to walk through the objects of a PDF file. In particular, you can detect the presence of an XObject, "open" it and walk through the objects it contains.

A PDF file is a collection of objects (text, lines, rectangles, images, etc.) placed on its pages. An XObject (more precisely, a form XObject) is a special kind of object: it is a container of other objects. An XObject can be placed on a single page or on several pages, but its content is stored only once in the PDF file. This makes XObjects very useful for repeating complex, heavy content on many pages while storing it only once. An XObject can contain one or more XObjects, each of which can contain other XObjects, and so on. I don't know whether the PDF Reference sets a limit on the number of nesting levels.

If you print a PDF with many nested XObjects on a PostScript printer, you can get a fatal error:

ERROR:limitcheck
OFFENDING COMMAND: gsave
STACK:
-savelevel-

This happens because every XObject is "translated" to PostScript with a pair of gsave and grestore commands. gsave pushes the current graphics state onto a stack, and grestore pops it back. The stack, like every buffer, has a limited size, and when it is full you get the error. Typically a PostScript RIP can handle about 15 levels of nested XObjects.

Using PdfCheckXObjectLevels.Checker you can get information about the nesting level of the XObjects in a PDF. At the moment the source code is very much a beta (suggestions and feedback are very welcome).

Usage:
PdfCheckXObjectLevels.Checker chk = new PdfCheckXObjectLevels.Checker();
chk.Exec("simple.pdf");
Console.WriteLine(chk.MaxLevel);
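Besides MaxLevel, the class also exposes PagesMaxLevel (one entry per page, index 0 is page 1) and Messages (a trace of the walk). A small helper like the one below could be used to find the pages that go over a given depth. This is only a sketch: PageReport and PagesOverLimit are hypothetical names, not part of the checker, and which limit to pass is up to you (15 is just the rough RIP figure mentioned above). Keep in mind that the checker counts the page's own resources as level 1.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of PdfCheckXObjectLevels.
public static class PageReport
{
    // Returns the 1-based numbers of the pages whose level exceeds 'limit'.
    // 'pagesMaxLevel' is expected in the layout of Checker.PagesMaxLevel:
    // index 0 holds the value for page 1.
    public static List<int> PagesOverLimit(int[] pagesMaxLevel, int limit)
    {
        List<int> pages = new List<int>();
        for (int i = 0; i < pagesMaxLevel.Length; i++)
        {
            if (pagesMaxLevel[i] > limit)
                pages.Add(i + 1);
        }
        return pages;
    }
}
```

After chk.Exec(...), calling PageReport.PagesOverLimit(chk.PagesMaxLevel, 15) would give the pages worth a closer look before sending the file to a PostScript device.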
The class:
using System;
using System.Collections.Generic;
using System.Text;
using iTextSharp.text.pdf;

namespace PdfCheckXObjectLevels
{
    public class Checker
    {
        private int _currentLevel = 0;
        private int _maxLevel;
        private int _pageMaxLevel = 0;
        private List<string> _messages;
        private int[] _pagesMaxLevel;

        // Deepest nesting level found in the whole document.
        public int MaxLevel { get { return _maxLevel; } }
        // Deepest nesting level found on each page (index 0 = page 1).
        public int[] PagesMaxLevel { get { return _pagesMaxLevel; } }
        // Trace of the walk: pages, levels and XObject names.
        public string[] Messages { get { return _messages.ToArray(); } }

        public int Exec(string pdfFileName)
        {
            _maxLevel = 0;
            _currentLevel = 0;
            _messages = new List<string>();
            PdfReader reader = new PdfReader(pdfFileName);
            try
            {
                _pagesMaxLevel = new int[reader.NumberOfPages];
                for (int p = 1; p <= reader.NumberOfPages; p++)
                {
                    _pageMaxLevel = 0;
                    _messages.Add("Page: " + p);
                    PdfDictionary pageDict = reader.GetPageN(p);
                    // Resolve the resources if they are stored as an indirect reference.
                    // GetPdfObject returns null when the entry is missing.
                    PdfDictionary res = PdfReader.GetPdfObject(pageDict.Get(PdfName.RESOURCES)) as PdfDictionary;
                    if (res != null)
                        InternalDictCheck(res);
                    else
                    {
                        // No resources? Is that possible?
                    }
                    _pagesMaxLevel[p - 1] = _pageMaxLevel;
                }
            }
            finally
            {
                // Release the file even if the walk throws.
                reader.Close();
            }
            return _maxLevel;
        }

        // Visits a resources dictionary and recurses into the resources of
        // every XObject it references. Level 1 is the page's own resources.
        private void InternalDictCheck(PdfDictionary dict)
        {
            _currentLevel++;
            _messages.Add(" CurrentLevel: " + _currentLevel);
            if (_currentLevel > _maxLevel)
                _maxLevel = _currentLevel;
            if (_currentLevel > _pageMaxLevel)
                _pageMaxLevel = _currentLevel;
            PdfDictionary xobjDict = PdfReader.GetPdfObject(dict.Get(PdfName.XOBJECT)) as PdfDictionary;
            if (xobjDict != null)
            {
                foreach (PdfName obj2Name in xobjDict.Keys)
                {
                    _messages.Add(" obj2Name: " + obj2Name);
                    // XObjects are streams, and a stream is also a dictionary.
                    PdfDictionary dict2 = PdfReader.GetPdfObject(xobjDict.Get(obj2Name)) as PdfDictionary;
                    if (dict2 == null)
                        continue;
                    // XObjects without resources (e.g. images) end the recursion here.
                    PdfDictionary dict3 = PdfReader.GetPdfObject(dict2.Get(PdfName.RESOURCES)) as PdfDictionary;
                    if (dict3 != null)
                        InternalDictCheck(dict3);
                }
            }
            _currentLevel--;
        }
    }
}
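One thing the class does not handle is a malformed file in which an XObject's resources point back to one of its ancestors: the recursion would then go on until the stack overflows. As a sketch of one possible guard, here is the same depth-first idea on a plain in-memory model, keeping a set of the nodes on the current path. The Node class and the DepthWalk.MaxDepth method are hypothetical and only stand in for the PDF dictionaries; nothing here is iTextSharp API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a resources dictionary and the XObjects it references.
public class Node
{
    public List<Node> Children = new List<Node>();
}

public static class DepthWalk
{
    // Depth of the deepest chain starting at 'root'. The root itself counts
    // as level 1, like the page resources in Checker.
    public static int MaxDepth(Node root)
    {
        return Visit(root, new HashSet<Node>());
    }

    private static int Visit(Node node, HashSet<Node> onPath)
    {
        // Add returns false if the node is already on the current path: a cycle.
        if (!onPath.Add(node))
            return 0;
        int deepest = 0;
        foreach (Node child in node.Children)
            deepest = Math.Max(deepest, Visit(child, onPath));
        // Leaving the node: it may legitimately be reached again via another path.
        onPath.Remove(node);
        return 1 + deepest;
    }
}
```

Only the current path is kept in the set, not every node ever seen, so an XObject reused in several places is still measured each time it is reached, as in the checker.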















